DNA expression systems based on alphaviruses

ABSTRACT

The disclosure describes recombinant alphavirus RNA molecules and expression of heterologous proteins therefrom in animal cells. Recombinant alphaviruses of the present invention, when made to express an antigenic protein, can be administered as vaccines.

The present invention is related to DNA expression systems based on alphaviruses, which systems can be used to transform animal cells for use in the production of desired products, such as proteins and vaccines, in high yields.

The rapid development of biotechnology is to a large extent due to the introduction of recombinant DNA techniques, which have revolutionized cell biological and medical research by opening new approaches to elucidate the molecular mechanisms of the cell. With the aid of cDNA cloning techniques, large numbers of interesting protein molecules are characterized each year. Therefore, much research activity is today directed to elucidating the relationship between structure and function of these molecules. Eventually this knowledge will increase our possibilities to preserve health and combat diseases in both humans and animals. Indeed, there is today a growing list of new "cloned" protein products that are already used as pharmaceuticals or diagnostics.

In the recombinant DNA approaches to study biological questions, DNA expression systems are crucial elements. Thus, efficient DNA expression systems, which are simple and safe to use, give high yields of the desired product and can be used in a variety of host cells, especially also in mammalian cells, are in great demand.

Many attempts have been made to develop DNA expression systems which fulfill these requirements. Often, viruses have been used as a source of such systems. However, to date none of the existing viral expression systems fulfills all these requirements in a satisfactory way. For instance, the Baculovirus expression system for cDNA is extremely efficient but can be used only in insect cells (see Reference 1 of the list of cited references; for the sake of convenience, in the following the cited references are only identified by the number they have on said list). As many important molecules will have to be produced and processed in cells of mammalian origin in order for them to become active, this system cannot be used in such cases. Furthermore, the Baculovirus cDNA expression system is not practically convenient for analysis of the relationship between structure and function of a protein because this in general involves the analysis of whole series of mutant variants. Today it takes about 6-8 weeks to construct a single Baculo recombinant virus for phenotype analyses. This latter problem also applies to the rather efficient Vaccinia recombinant virus and other contemporary recombinant virus cDNA expression systems (2,3). Establishing stably transformed cell lines is also very laborious and, in addition, is often associated with very low levels of protein expression.

Hitherto, most attempts to develop viral DNA expression systems have been based on viruses having DNA genomes or retroviruses, the replicative intermediate of the latter being double stranded DNA.

Recently, however, viruses comprising RNA genomes have also been used to develop DNA expression systems.

In EP 0 194 809 RNA transformation vectors derived from (+) strand RNA viruses are disclosed which comprise capped viral RNA that has been modified by insertion of exogenous RNA into a region non-essential for replication of said virus RNA genome. These vectors are used for expression of the function of said exogenous RNA in cells transformed therewith. The RNA can be used in solution or packaged into capsids. Furthermore, this RNA can be used to generate new cells having new functions, i.e. protein expression. The invention of said reference is generally claimed as regards host cells, (+) strand RNA viruses and the like. Nevertheless, it is obvious from the experimental support provided therein that only plant cells have been transformed and in addition only Bromo Mosaic virus, a plant virus, has been used as transformation vector.

Although it is stated in said reference that it is readily apparent to those skilled in the art to convert any RNA virus-cell system to a useful expression system for exogenous DNA using principles described in the reference, this has not been proven to be true at least in the case of animal cell RNA viruses. There seem to be several reasons for this. These include:

1) Inefficiencies in transfecting animal cells with in vitro transcribed RNA;

2) Inefficiency of apparently replication competent RNA transcripts to start RNA replication after commonly used transfection procedures;

3) The inability to produce high titre stocks of recombinant virus that does not contain any helper virus;

4) The inability to establish stable traits of transformed cells expressing the function of the exogenous RNA.

In Proc. Natl. Acad. Sci. USA, Vol 84, 1987, pp 4811-4815 a gene expression system based on a member of the Alphavirus genus, viz. Sindbis virus, is disclosed which is used to express the bacterial CAT (chloramphenicol acetyltransferase) gene in avian cells, such as chicken embryo fibroblasts.

Xiong et al., Science, Vol 243, 1989, 1188-1191 also disclose a gene expression system based on Sindbis virus. This system is said to be efficient in a broad range of animal cells. Expression of the bacterial CAT gene in insect, avian and mammalian cells inclusive of human cells is disclosed therein.

Even though it is known from prior art that one member of the Alphavirus genus, the Sindbis virus, can tolerate insertion and direct the expression of at least one foreign gene, the bacterial chloramphenicol acetyltransferase (CAT) gene, it is evident from the results described that both systems described above are ineffective in terms of exogenous gene expression and also very cumbersome to use. Hence, neither system has found any usage in the field of DNA expression in animal cells today.

In the first example a cDNA copy of a defective interfering (DI) virus variant of Sindbis virus was used to carry the CAT gene. RNA was transcribed in vitro and used to transfect avian cells, and some CAT protein production could be demonstrated after infecting the cells with wild-type Sindbis virus. The latter virus provided the viral replicase for expression of the CAT construct. The inefficiency of this system is due to 1) a low level of initial DI-CAT RNA transfection (0.05-0.5% of cells) and 2) inefficient usage of the DI-CAT RNA for protein translation because of unnatural and suboptimal translation initiation signals. This same system also results in packaging of some of the recombinant DI-CAT genomes into virus particles. However, this occurs simultaneously with a very large excess of wild-type Sindbis virus production. Therefore, the usage of this mixed virus stock for CAT expression will be much hampered by the fact that most of the replication and translation activity of the cells infected with such a stock will deal with the wild-type and not with recombinant gene expression.

Many of the same problems are inherent to the other Sindbis expression system described. In this system an RNA replication competent Sindbis DNA vector is used to carry the CAT gene. RNA produced in vitro is shown to replicate in animal cells and CAT activity is found. However, as only a very low number of cells are transfected, the overall CAT production remains low. Another possible explanation for this is that the Sindbis construct used is not optimal for replication. Wild-type Sindbis virus can be used to rescue the recombinant genome into particles together with an excess of wild-type genomes, and this mixed stock can then be used to express CAT protein via infection. However, this stock has the same problems as described above for the recombinant DI system. The latter paper also shows that if virus is amplified by several passages, increased titres of the recombinant virus particles can be obtained. However, one should remember that the titre of the wild-type virus will increase correspondingly, and the original problem of mostly wild-type virus production remains. There are also several potential problems when using several passages to produce a mixed virus stock. As there is no selective pressure for preservation of the recombinant genomes, these might easily 1) undergo rearrangements and 2) become outnumbered by wild-type genomes as a consequence of less efficient replication and/or packaging properties.

Another important aspect of viral DNA expression vectors is their use to express antigens of unrelated pathogens, so that they can be used as vaccines against such pathogens.

Development of safe and effective vaccines against viral diseases has proven to be quite a difficult task. Although many existing vaccines have helped to combat the worldwide spread of many infectious diseases, there is still a large number of infectious agents against which effective vaccines are missing. The current procedures of preparing vaccines present several problems: (1) it is often difficult to prepare sufficiently large amounts of antigenic material; (2) in many cases there is the additional hazard that the vaccine preparation is not killed or sufficiently attenuated; (3) effective vaccines are often hard to produce since there is a major difficulty in presenting the antigenic epitope in an immunologically active form; (4) in the case of many viruses, genetic variations in the antigenic components result in the evolution of new strains with new serological specificities, which again creates a need for the development of new vaccines.

Two types of viral DNA vectors have been developed in order to overcome many of these problems in vaccine production. These provide either recombinant viruses or chimaeric viruses. The recombinant viruses contain a wild-type virus package around a recombinant genome. These particles can be used to infect cells which then produce the antigenic protein from the recombinant genome. The chimaeric viruses also contain a recombinant genome, but this specifies the production of an antigen, usually as part of a normal virus structural protein, which then will be packaged in progeny particles and e.g. exposed on the surface of the viral spike proteins. The major advantages of these kinds of virus preparations for use as a vaccine are 1) that they can be produced in large scale and 2) that they provide antigen in a natural form to the immunological system of the organism. Cells which have been infected with recombinant viruses will synthesize the exogenous antigen product, process it into peptides and then present them to T cells in the normal way. In the case of the chimaeric virus there is, in addition, an exposition of the antigen in the context of the subunits of the virus particle itself. Therefore, the chimaeric virus is also called an epitope carrier.

The major difficulty with these kinds of vaccine preparations is how to ensure a safe and limited replication of the particles in the host without side effects. So far, some success has been obtained with vaccinia virus as an example of the recombinant virus approach (69) and with polio virus as an example of a chimaeric particle (70-72). As both virus variants are based on commonly used vaccine strains, one might argue that they could be useful vaccine candidates also as recombinant and chimaeric particles, respectively (69-72). However, both virus vaccines are associated with a risk of side effects, even severe ones, and in addition these virus strains have already been used as vaccines in large parts of the population in many countries.

As is clear from the aforementioned discussion, there is much need to develop improved DNA expression systems both for an easy production of important proteins or polypeptides in high yields in various kinds of animal cells and for the production of recombinant viruses or chimaeric viruses to be used as safe and efficient vaccines against various pathogens.

Thus, an object of the present invention is to provide an improved DNA expression system based on virus vectors which can be used both to produce proteins and polypeptides and as recombinant virus or chimaeric virus, which system offers many advantages over prior art.

To that end, according to the present invention there is provided an RNA molecule derived from an alphavirus RNA genome and capable of efficient infection of animal host cells, which RNA molecule comprises the complete alphavirus RNA genome regions, which are essential to replication of the said alphavirus RNA, and further comprises an exogenous RNA sequence capable of expressing its function in said host cell, said exogenous RNA sequence being inserted into a region of the RNA molecule which is non-essential to replication thereof.

Alphavirus is a genus belonging to the family Togaviridae having single stranded RNA genomes of positive polarity enclosed in a nucleocapsid surrounded by an envelope containing viral spike proteins.

The Alphavirus genus comprises among others the Sindbis virus, the Semliki Forest virus (SFV) and the Ross River virus, which are all closely related. According to a preferred embodiment of the invention, the Semliki Forest virus (SFV) is used as the basis of the DNA expression system.

The exogenous RNA sequence encodes a desired genetic trait, which is to be conferred on the virus or the host cell, and said sequence is usually complementary to a DNA or cDNA sequence encoding said genetic trait. Said DNA sequence may be comprised of an isolated natural gene, such as a bacterial or mammalian gene, or may constitute a synthetic DNA sequence coding for the desired genetic trait i.e. expression of a desired product, such as an enzyme, hormone, etc. or expression of a peptide sequence defining an exogenous antigenic epitope or determinant.

If the exogenous RNA sequence codes for a product, such as a protein or polypeptide, it is inserted into the viral RNA genome replacing deleted structural protein encoding region(s) thereof, whereas a viral epitope encoding RNA sequence may be inserted into structural protein encoding regions of the viral RNA genome, which essentially do not comprise deletions or only have a few nucleotides deleted.

The RNA molecule can be used per se, e.g. in solution to transform animal cells by conventional transfection, e.g. the DEAE-Dextran method or the calcium phosphate precipitation method. However, the rate of transformation of cells, and thus the expression rate, can be expected to increase substantially if the cells are transformed by infection with infectious viral particles. Thus, a suitable embodiment of the invention is related to an RNA virus expression vector comprising the RNA molecule of this invention packaged into infectious particles comprising the said RNA within the alphavirus nucleocapsid and surrounded by the membrane including the alphavirus spike proteins.

The RNA molecule of the present invention can be packaged into such particles without restraints provided that it has a total size corresponding to the wild type alphavirus RNA genome or deviating therefrom to an extent compatible with package of the said RNA into the said infectious particles.

These infectious particles, which include recombinant genomes packaged to produce a pure, high titre recombinant virus stock, provide a means for exogenous genes or DNA sequences to be expressed by normal virus particle infection, which, as regards transformation degree, is much more efficient than RNA transfection.

According to a suitable embodiment of the invention such infectious particles are produced by cotransfection of animal host cells with the present RNA which lacks part of or the complete region(s) encoding the structural viral proteins together with a helper RNA molecule transcribed in vitro from a helper DNA vector comprising the SP6 promoter region, those 5' and 3' regions of the alphavirus cDNA which encode cis acting signals needed for RNA replication and the region encoding the viral structural proteins but lacking essentially all of the nonstructural virus proteins encoding regions including sequences encoding RNA signals for packaging of RNA into nucleocapsid particles, and culturing the host cells.
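The division of functions between the recombinant RNA and the helper RNA described above can be summarized in a small data model. The following is only an illustrative sketch: the class, field and function names are hypothetical, and the flags simply restate which regions the text says each RNA carries or lacks.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RnaConstruct:
    """Hypothetical summary of the regions carried by one RNA molecule."""
    name: str
    cis_replication_signals: bool  # 5' and 3' signals needed for RNA replication
    nonstructural_genes: bool      # region encoding the nonstructural proteins
    structural_genes: bool         # region encoding the viral structural proteins
    packaging_signal: bool         # RNA signal for packaging into nucleocapsids


# Recombinant RNA: replication competent, structural region deleted.
recombinant = RnaConstruct("recombinant", True, True, False, True)
# Helper RNA: supplies structural proteins, lacks nonstructural region and packaging signal.
helper = RnaConstruct("helper", True, False, True, False)


def particles_can_form(constructs: List[RnaConstruct]) -> bool:
    """Particles need nonstructural and structural genes from some RNA in the cell."""
    has_nonstructural = any(c.nonstructural_genes for c in constructs)
    has_structural = any(c.structural_genes for c in constructs)
    return has_nonstructural and has_structural


def packaged_genomes(constructs: List[RnaConstruct]) -> List[str]:
    """Only RNAs carrying the packaging signal end up inside particles."""
    return [c.name for c in constructs if c.packaging_signal]


print(particles_can_form([recombinant, helper]))
print(packaged_genomes([recombinant, helper]))
```

In this simplified model, cotransfection is required for particle formation, while only the recombinant genome is packaged.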

According to another aspect of the invention efficient introduction of the present RNA into animal host cells can be achieved by electroporation. For example, in the case of Baby Hamster Kidney (BHK) cells a transformation degree of almost 100% has been obtained for the introduction of an RNA transcript derived from SFV cDNA of the present invention. This makes it possible to reach so high levels of exogenous protein production in every cell that the proteins can be followed in total cell lysates without the need of prior concentration by antibody precipitation.

By electroporation, it is also possible to obtain a high degree of cotransfection in the above process for production of infectious particles comprising packaged RNA of the present invention. Essentially all animal cells will contain both the present RNA molecule and the helper RNA molecule, which leads to a very efficient trans complementation and formation of infectious particles. A pure recombinant virus stock, consisting of up to 10⁹-10¹⁰ infectious particles, can be obtained from 5×10⁶ cotransfected cells after only a 24 h incubation. Furthermore, the so obtained virus stock is very safe to use, since it is comprised of viruses containing only the desired recombinant genome, which can infect host cells but cannot produce new progeny virus.
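The stated stock size can be converted into an average yield per cotransfected cell. The short sketch below only performs that arithmetic on the figures given in the text; the function name is hypothetical.

```python
# Illustrative arithmetic only: average particle yield per cotransfected cell.

def particles_per_cell(total_particles: float, cells: float) -> float:
    """Return the average number of infectious particles per cotransfected cell."""
    return total_particles / cells


cells = 5e6            # cotransfected cells, as stated in the text
low, high = 1e9, 1e10  # stated range of infectious particles in the stock

yield_low = particles_per_cell(low, cells)
yield_high = particles_per_cell(high, cells)
print(f"{yield_low:.0f} to {yield_high:.0f} infectious particles per cell")
```

On these figures the stock corresponds to roughly 200 to 2000 infectious particles per cotransfected cell within 24 h.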

Theoretically, a regeneration of a wild-type virus genome could take place when producing the recombinant virus in the cotransfected cells. However, the possibility of spread of such virus can be eliminated by incorporating a conditionally lethal mutation into the structural part of the helper genome. Such a mutation is described in the experimental part of this application. Thus, the virus produced with such a helper will be noninfectious if not treated in vitro under special conditions.

The technique of electroporation is well known within the field of biotechnology and optimal conditions can be established by the man skilled in the art. For instance, a BioRad Gene pulser apparatus (BioRad, Richmond, Calif., USA) can be used to perform said process.

The RNA molecule of the present invention is derived by in vivo or in vitro transcription of a cDNA clone, originally produced from an alphavirus RNA and comprising an inserted exogenous DNA fragment encoding a desired genetic trait.

Accordingly, the present invention is also related to a DNA expression vector comprising a full-length or partial cDNA complementary to alphavirus RNA or parts thereof and located immediately downstream of the SP6 RNA polymerase promoter and having a 5'ATGG, a 5'GATGG or any other 5' terminus and a TTTCCA₆₉ ACTAGT (SEQ ID NO.:25) or any other 3' terminus.
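For illustration, the 3' terminus named above can be written out in full. This sketch assumes that the subscript 69 denotes a run of 69 adenine residues, i.e. TTTCC followed by 69 A's and then ACTAGT; that reading, and the function name, are assumptions made here and should be checked against SEQ ID NO:25.

```python
# Assumption: "A69" in the stated 3' terminus means 69 consecutive adenines.

def three_prime_terminus(poly_a_length: int = 69) -> str:
    """Spell out TTTCC + poly(A) + ACTAGT for a given poly(A) length."""
    return "TTTCC" + "A" * poly_a_length + "ACTAGT"


terminus = three_prime_terminus()
print(terminus)
print(len(terminus))
```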

According to one aspect of the present invention portions of the viral cDNA are deleted, the deletions comprising the complete or part of the region(s) encoding the virus structural proteins, and the vector further comprises an integrated polylinker region, which may correspond to BamHI-SmaI-XmaI, inserted at a location which enables an exogenous DNA fragment encoding a foreign polypeptide or protein to be inserted into the vector cDNA for subsequent expression in an animal host cell.

According to another aspect of this invention, the vector is comprised of full-length cDNA wherein an exogenous DNA fragment encoding a foreign epitopic peptide sequence can be inserted into a region coding for the viral structural proteins.

It is appreciated that this cDNA clone with its exogenous DNA insert is very efficiently replicated after having been introduced into animal cells by transfection.

A very important aspect of the present invention is that it is applicable to a broad range of host cells of animal origin. These host cells can be selected from avian, mammalian, reptilian, amphibian, insect and fish cells. Illustrative of mammalian cells are human, monkey, hamster, mouse and porcine cells. Suitable avian cells are chicken cells, and as reptilian cells viper cells can be used. Cells from frogs and from mosquitoes and flies (Drosophila) are illustrative of amphibian and insect cells, respectively. A very efficient virus vector/host cell system according to the invention is based on SFV/BHK cells, which will be discussed more in detail further below.

However, even though a very important advantage of the present DNA expression vector is that it is very efficient in a broad variety of animal cells it can also be used in other eucaryotic cells and in procaryotic cells.

The present invention is also related to a method to produce transformed animal host cells comprising transfection of the cells with the present RNA molecule or with the present transcription vector comprised of cDNA and carrying an exogenous DNA fragment. According to a suitable embodiment of the invention, transfection is produced by the above mentioned electroporation method, a very high transfection rate being obtained.

A further suitable transformation process is based on infection of the animal host cells with the above mentioned infectious viral particles comprising the present RNA molecule.

The transformed cells of the present invention can be used for different purposes.

One important aspect of the invention is related to use of the present transformed cells to produce a polypeptide or a protein by culturing the transformed cells to express the exogenous RNA and subsequent isolation and purification of the product formed by said expression. The transformed cells can be produced by infection with the present viral particles comprising exogenous RNA encoding the polypeptide or protein as mentioned above, or by transfection with an RNA transcript obtained by in vitro transcription of the present DNA vector comprised of cDNA and carrying an exogenous DNA fragment coding for the polypeptide or the protein.

Another important aspect of the invention is related to use of the present transformed cells for the production of antigens comprised of chimaeric virus particles for use as immunizing component in vaccines or for immunization purposes for in vivo production of immunizing components for antisera production.

Accordingly, the present invention is also related to an antigen consisting of a chimaeric alphavirus having an exogenous epitopic peptide sequence inserted into its structural proteins.

Preferably, the chimaeric alphavirus is derived from SFV.

According to a suitable embodiment, the exogenous epitopic peptide sequence is comprised of an epitopic peptide sequence derived from a structural protein of a virus belonging to the immunodeficiency virus class inclusive of the human immunodeficiency virus types.

A further aspect of the invention is related to a vaccine preparation comprising the said antigen as immunizing component.

In said vaccine the chimaeric alphavirus is suitably attenuated by comprising mutations, such as the conditionally lethal SFV-mutation described before, amber (stop codon) or temperature sensitive mutations, in its genome.

For instance, if chimaeric virus particles containing the aforementioned conditional lethal mutation in their structural proteins (a defect in undergoing a certain proteolytic cleavage in the host cell during morphogenesis) are used as a vaccine, then such chimaeric virus particles are first activated by limited proteolytic treatment before being given to the organism so that they can infect recipient cells. New chimaeric particles will be formed in cells infected with the activated virus, but these will again have the conditional lethal phenotype and further spread of infection is not possible.

The invention is also concerned with a method for the production of the present antigen comprising

a) in vitro transcription of the cDNA of the present DNA vector carrying an exogenous DNA fragment encoding the foreign epitopic peptide sequence and transfection of animal host cells with the produced RNA transcript, or

b) transfection of animal host cells with the said cDNA of the above step a), culturing the transfected cells and recovering the chimaeric alphavirus antigen. Preferably, transfection is produced by electroporation.

Still another aspect of the invention is to use a recombinant virus containing exogenous RNA encoding a polypeptide antigen for vaccination purposes or to produce antisera. In this case the recombinant virus or the conditionally lethal variant of it is used to infect cells in vivo, and antigen production will take place in the infected cells and be used for antigen presentation to the immunological system.

According to another embodiment of the invention, the present antigen is produced in an organism by using in vivo infection with the present infectious particles containing exogenous RNA encoding an exogenous epitopic peptide sequence.

BRIEF DESCRIPTION OF THE DRAWINGS

In the following, the present invention will be illustrated more in detail with reference to the Semliki Forest virus (SFV), which is representative for the alphaviruses. This description can be more fully understood in conjunction with the accompanying drawings in which:

FIG. 1 is a schematic view of the main assembly and disassembly events involved in the life cycle of the Semliki Forest virus, and also shows regulation of the activation of SFV entry functions by p62 cleavage and pH.

FIG. 2 illustrates the use of translocation signals during synthesis of the structural proteins of SFV; top, the gene map of the 26S subgenomic RNA; middle, the process of membrane translocation of the p62, 6K and E1 proteins; small arrows on the lumenal side denote signal peptidase cleavages; at the bottom, the characteristics of the three signal peptides are listed;

FIG. 3 shows features that make SFV an excellent choice as an expression vector.

FIGS. 4A-4D show the construction of full-length infectious clones of SFV; FIG. 4A shows a schematic restriction map of the SFV genome; primers used for initiating cDNA synthesis are indicated as arrows, and the cDNA inserts used to assemble the final clone are shown as bars; FIG. 4B shows plasmid pPLH211, i.e. the SP6 expression vector used as carrier for the full-length infectious clone of SFV; FIG. 4C shows the resulting plasmid pSP6-SFV4; FIG. 4D shows the structure of the SP6 promoter area of the SFV clone (SEQ ID NO:25); the solid bar indicates the SP6 promoter sequence, and the first nucleotide to be transcribed is marked by an asterisk; underlined regions denote authentic SFV sequences.

FIGS. 5A-5Q show the complete nucleotide sequence of the pSP6-SFV4 RNA transcript as DNA (U=T) (SEQ ID NO:1) and, underneath the DNA sequence, the amino acid sequence of the non-structural polyprotein and the structural polyprotein (SEQ ID NO:2).

FIG. 6 shows an SFV cDNA expression system for the production of virus after transfection of in vitro made RNA into cells.

FIGS. 7A-7C show the construction of the SFV expression vectors pSFV1-3 and of the Helper 1.

FIGS. 8A-8C show the polylinker region of SFV vector plasmids pSFV2 and pSFV3 (SEQ ID NO:4,5 and 6); the position of the promoter for the subgenomic 26S RNA is boxed, and the first nucleotide to be transcribed is indicated by an asterisk;

FIG. 9 is a schematic presentation of in vivo packaging of pSFV1-dhfr RNA into infectious particles using helper trans complementation; (dhfr means dihydrofolate reductase)

FIG. 10 shows the use of trypsin to convert p62-containing noninfectious virus particles to infectious particles by cleavage of p62 to E2 and E3.

FIGS. 11A-11E show the expression of heterologous proteins in BHK cells upon RNA transfection by electroporation.

FIGS. 12A-12B show in its upper part sequences encompassing the major antigenic site of SFV and the in vitro made substitutions leading to a BamHI restriction endonuclease site (SEQ ID NO:7,8), sequences spanning the principal neutralizing domain of the HIV gp120 protein (SEQ ID NO:9,10), and the HIV domain inserted into the SFV carrier protein E2 as a BamHI oligonucleotide (SEQ ID NO:11,12); and its lower part is a schematic presentation of the SFV spike structure with blow-ups of domain 246-251 in either wild type or chimaeric form.

The alphavirus Semliki Forest virus (abbreviated SFV in the following text) has for some 20 years been used as model system in both virology and cell biology to study membrane biosynthesis, membrane structure and membrane function as well as protein-RNA interactions (4, 5). The major reason for the use of SFV as such a model is due to its simple structure and efficient replication.

With reference to FIGS. 1-3, in the following the SFV and its replication are explained in more detail. In essential parts, this disclosure is true also for the other alphaviruses, such as the Sindbis virus, and many of the references cited in this connection are indeed directed to the Sindbis virus. SFV consists of an RNA-containing nucleocapsid and a surrounding membrane composed of a lipid bilayer and proteins, a regularly arranged icosahedral shell of a protein called C protein forming the capsid inside which the genomic RNA is packaged. The capsid is surrounded by the lipid bilayer that contains three proteins called E1, E2, and E3. These so-called envelope proteins are glycoproteins and their glycosylated portions are on the outside of the lipid bilayer, complexes of these proteins forming the "spikes" that can be seen in electron micrographs to project outward from the surface of the virus.

The SFV genome is a single-stranded 5'-capped and 3'-polyadenylated RNA molecule of 11422 nucleotides (6,7). It has positive polarity, i.e. it functions as an mRNA, and naked RNA is able to start an infection when introduced into the cytoplasm of a cell. Infection is initiated when the virus binds to protein receptors on the host cell plasma membrane, whereby the virions become selectively incorporated into "coated pits" on the surface of the plasma membrane, which invaginate to form coated vesicles inside the cell, whereafter said vesicles bearing endocytosed virions rapidly fuse with organelles called endosomes. From the endosome, the virus escapes into the cell cytosol as the bare nucleocapsid, the viral envelope remaining in the endosome. Thereafter, the nucleocapsid is "uncoated" and, thus, the genomic RNA is released. Referring now to FIG. 1, infection then proceeds with the translation of the 5' two-thirds of the genome into a polyprotein which by self-cleavage is processed to the four nonstructural proteins nsP1-4 (8). Protein nsP1 is a methyltransferase which is responsible for virus-specific capping activity as well as initiation of minus strand synthesis (9, 10); nsP2 is the protease that cleaves the polyprotein into its four subcomponents (11, 12); nsP3 is a phosphoprotein (13, 14) of as yet unknown function, and nsP4 contains the SFV RNA polymerase activity (15, 16). Once the nsP proteins have been synthesized they are responsible for the replication of the plus strand (42S) genome into full-length minus strands. These molecules then serve as templates for the production of new 42S genomic RNAs. They also serve as templates for the synthesis of subgenomic (26S) RNA. This 4073-nucleotide RNA is colinear with the last one-third of the genome, and its synthesis is internally initiated at the 26S promoter on the 42S minus strands (17, 18).
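The stated lengths are consistent with the "two-thirds" and "one-third" descriptions, as a quick check shows. The sketch below only divides the two nucleotide counts given in the text; it makes no assumption about exact coordinates, untranslated regions or the poly(A) tail, and the variable names are illustrative.

```python
# Illustrative arithmetic on the two lengths stated in the text.

GENOME_LENGTH_NT = 11422     # 42S genomic RNA
SUBGENOMIC_LENGTH_NT = 4073  # 26S subgenomic RNA

subgenomic_fraction = SUBGENOMIC_LENGTH_NT / GENOME_LENGTH_NT
remaining_nt = GENOME_LENGTH_NT - SUBGENOMIC_LENGTH_NT
remaining_fraction = remaining_nt / GENOME_LENGTH_NT

print(f"26S RNA: {subgenomic_fraction:.1%} of the genome")
print(f"Remaining 5' portion: {remaining_nt} nt ({remaining_fraction:.1%})")
```

The 26S RNA thus corresponds to a little over one third of the genome length, leaving somewhat under two thirds upstream of it.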

The capsid and envelope proteins are synthesized in different compartments, and they follow separate pathways through the cytoplasm, viz. the envelope proteins are synthesized by membrane-bound ribosomes attached to the rough endoplasmic reticulum, and the capsid protein is synthesized by free ribosomes in the cytosol. However, the 26S RNA codes for all the structural proteins of the virus, and these are synthesized as a polyprotein precursor in the order C-E3-E2-6K-E1 (19). Once the capsid (C) protein has been synthesized it folds to act as a protease cleaving itself off the nascent chain (20, 21). The synthesized C proteins bind to the recently replicated genomic RNA to form new nucleocapsid structures in the cell cytoplasm.

The said cleavage reveals an N-terminal signal sequence in the nascent chain which is recognized by the signal recognition particle targeting the nascent chain-ribosome complex to the endoplasmic reticulum (ER) membrane (22, 23), where it is cotranslationally translocated and cleaved by signal peptidase to the three structural membrane proteins p62 (precursor form of E3/E2), 6K and E1 (24, 25). The translocational signals used during the synthesis of the structural proteins are illustrated in FIG. 2. The membrane proteins undergo extensive posttranslational modifications within the biosynthetic transport pathway of the cell. The p62 protein forms a heterodimer with E1 via its E3 domain in the endoplasmic reticulum (26). This dimer is transported out to the plasma membrane, where virus budding occurs through spike-nucleocapsid interactions. At a very late (post-Golgi) stage of transport the p62 protein is cleaved to E3 and E2 (27), the forms that are found in mature virions. This cleavage activates the host cell binding function of the virion as well as the membrane fusion potential of E1. The latter activity is expressed by a second, low-pH activation step after the virus enters the endosomes of a new host cell and is responsible for the release of the viral nucleocapsid into the cell cytoplasm (28-32). The mature virus particle contains a single copy of the RNA genome encapsidated within 180 copies of the capsid protein in a T=3 symmetry, and is surrounded by a lipid bilayer carrying 240 copies of the spike protein complex (E1+E2+E3), arranged in groups of three in a T=4 symmetry (33).
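The copy numbers quoted for the mature particle agree with the general rule for icosahedral shells, under which a lattice of triangulation number T contains 60·T subunits. The following minimal sketch only illustrates that arithmetic; the function and variable names are hypothetical and are not part of the disclosure.

```python
# Minimal sketch: subunit counts for icosahedral lattices (60 * T rule).
# Names are illustrative only; the T numbers are those quoted in the text.

def subunits_for_t_number(t: int) -> int:
    """Return the number of subunits in an icosahedral shell of triangulation number t."""
    if t < 1:
        raise ValueError("triangulation number must be a positive integer")
    return 60 * t

capsid_copies = subunits_for_t_number(3)  # capsid protein shell, T=3
spike_copies = subunits_for_t_number(4)   # spike complexes in the envelope, T=4
spike_groups_of_three = spike_copies // 3  # spikes are arranged in groups of three

print(capsid_copies, spike_copies, spike_groups_of_three)
```

With T=3 and T=4 the rule gives 180 and 240 copies respectively, matching the figures stated above.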

The SFV entry functions are activated and regulated by p62 cleavage and pH. More specifically, the p62-E1 heterodimers formed in the ER are acid resistant. When these heterodimers are transported to the plasma membrane via the Golgi complex the E1 fusogen cannot be activated in spite of the mildly acidic environment, since activation requires dissociation of the complex. As is illustrated in FIG. 1, the released virus particles contain E2E1 complexes. Since the association between E2 and E1 is sensitive to acidic pH, during entry of the virus into a host cell through endocytosis the acidic milieu of the endosome triggers the dissociation of the spike complex (E1 E2 E3) resulting in free E1. The latter can be activated for the catalysis of the fusion process between the viral and endosomal membranes in the infection process as disclosed above.

As indicated in the preceding parts of the disclosure, the alphavirus system, and especially the SFV system, has several unique features which are advantageous in DNA expression systems. These are summarized below with reference to FIG. 3.

1. Genome of positive polarity. The SFV RNA genome is of positive polarity, i.e. it functions directly as mRNA, and infectious RNA molecules can thus be obtained by transcription from a full-length cDNA copy of the genome.

2. Efficient replication. The infecting RNA molecule codes for its own RNA replicase, which in turn drives an efficient RNA replication. Indeed, SFV is one of the most efficiently replicating viruses known. Within a few hours up to 200,000 copies of the plus-RNAs are made in a single cell. Because of the abundance of these molecules practically all ribosomes of the infected cell will be enrolled in the synthesis of the virus encoded proteins, thus taking over host protein synthesis (34), and pulse-labelling of infected cells results in almost exclusive labelling of viral proteins. During a normal infection 10⁵ new virus particles are produced from one single cell, which calculates to at least 10⁸ protein molecules encoded by the viral genome (5).

3. Cytoplasmic replication. SFV replication occurs in the cell cytoplasm, where the virus replicase transcribes and caps the subgenomes for production of the structural proteins (19). It would obviously be very valuable to include this feature in a cDNA expression system to eliminate the many problems that are encountered in the conventional "nuclear" DNA expression systems, such as mRNA splicing, limitations in transcription factors, problems with capping efficiency and mRNA transport.

4. Late onset of cytopathic effects. The cytopathic effects in the infected cells appear rather late during infection. Thus, there is an extensive time window from about 4 hours after infection to up to 24 hours after infection during which a very high expression level of the structural proteins is combined with negligible morphological change.

5. Broad host range. This phenomenon is probably a consequence of the normal life cycle which includes transmission through arthropod vectors to wild rodents and birds in nature. Under laboratory conditions, SFV infects cultured mammalian, avian, reptilian and insect cells (35) (Xiong, et al, loc. cit.).

6. In nature SFV is of very low pathogenicity for humans. In addition, the stock virus produced in tissue culture cells is apparently apathogenic. By means of specific mutations it is possible to create conditionally lethal mutations of SFV, a feature that is of great use to uphold safety when mass production of virus stocks is necessary.
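The order of magnitude of the protein yield quoted under item 2 can be checked against the particle composition given earlier (180 capsid copies and 240 spike complexes of E1+E2+E3 per particle). The sketch below is an illustrative calculation only; it counts just the polypeptides that end up in released particles, so nonstructural proteins, the 6K peptide and unassembled structural proteins are deliberately left out, and all names are hypothetical.

```python
# Minimal sketch: structural polypeptides incorporated into released particles.
# Values are taken from the text; the names are illustrative only.

CAPSID_COPIES_PER_PARTICLE = 180
SPIKE_COMPLEXES_PER_PARTICLE = 240
CHAINS_PER_SPIKE_COMPLEX = 3      # E1, E2 and E3
PARTICLES_PER_CELL = 10**5        # particles released from one infected cell

def structural_chains_per_particle() -> int:
    """Capsid copies plus the three polypeptides of every spike complex."""
    return (CAPSID_COPIES_PER_PARTICLE
            + SPIKE_COMPLEXES_PER_PARTICLE * CHAINS_PER_SPIKE_COMPLEX)

def structural_chains_per_cell() -> int:
    """Polypeptides packaged into all particles released by one cell."""
    return structural_chains_per_particle() * PARTICLES_PER_CELL

print(structural_chains_per_particle(), structural_chains_per_cell())
```

Counting only packaged polypeptides gives 900 chains per particle and 9×10⁷ chains per cell, i.e. the same order of magnitude as the figure cited under item 2, which additionally covers virus-encoded proteins that are not incorporated into particles.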

In the nucleotide and amino acid sequences the following abbreviations have been used in this specification:

Ala, alanine; Ile, isoleucine; Leu, leucine; Met, methionine; Phe, phenylalanine; Pro, proline; Trp, tryptophan; Val, valine; Asn, asparagine; Cys, cysteine; Gln, glutamine; Gly, glycine; Ser, serine; Thr, threonine; Tyr, tyrosine; Arg, arginine; His, histidine; Lys, lysine; Asp, aspartic acid; Glu, glutamic acid; A, adenine; C, cytosine; G, guanine; T, thymine; U, uracil.

The materials and the general methodology used in the following examples are disclosed below.

1. Materials. Most restriction enzymes, DNA Polymerase I, Klenow fragment, calf intestinal phosphatase, T4 DNA ligase and T4 Polynucleotide kinase were from Boehringer (Mannheim, FRG). SphI, StuI and KpnI together with RNase inhibitor (RNasin) and SP6 Polymerase were from Promega Biotec (Madison, Wis.). Sequenase (Modified T7 polymerase) was from United States Biochemical (Cleveland, Ohio). Proteinase K was from Merck (Darmstadt, FRG). Ribonucleotides, deoxyribonucleotides, dideoxyribonucleotides and the cap analogue m⁷G(5')ppp(5')G were from Pharmacia (Sweden). Oligonucleotides were produced using an Applied Biosystems synthesizer 380B followed by HPLC and NAP-5 (Pharmacia) purification. Spermidine, phenylmethylsulfonyl fluoride (PMSF), diethylpyrocarbonate (DEPC), bovine serum albumin (BSA), creatine phosphate and creatine phosphokinase were from Sigma (St. Louis, Mo.). Pansorbin was from CalBiochem (La Jolla, Calif.). Agarose was purchased from FMC BioProducts (Rockland, Me.), and acrylamide from BioRad (Richmond, Calif.). L-[³⁵S]methionine and α-[³⁵S]-dATP-α-S were from Amersham.

2. Virus growth and purification. BHK-21 cells were grown in BHK medium (Gibco Life Technologies, Inc., New York) supplemented with 5% fetal calf serum, 10% tryptose phosphate broth, 10 mM HEPES (N-2-hydroxyethylpiperazine-N'-2-ethanesulfonic acid) and 2 mM glutamine. 90% confluent monolayers were washed once with PBS and infected with SFV in MEM containing 0.2% bovine serum albumin (BSA), 10 mM HEPES and 2 mM glutamine at a multiplicity of 0.1. Twenty-four hours post infection (p.i.) the medium was collected and cell debris removed by centrifugation at 8,000 × g for 20 min at 4° C. The virus was pelleted from the medium by centrifugation at 26,000 rpm for 1.5 h in an SW28 rotor at 4° C. The virus was resuspended in TN containing 0.5 mM EDTA.

3. Metabolic labeling and immunoprecipitation. Confluent monolayers of BHK cells grown in MEM supplemented with 10 mM HEPES, 2 mM glutamine, 0.2% BSA, 100 IU/ml of penicillin and 100 μg/ml streptomycin, were infected at a multiplicity of 50 at 37° C. After 1 h p.i. the medium was replaced with fresh medium and growth continued for 3.5 h. The medium was removed and cells washed once with PBS and overlaid with methionine-free MEM containing 10 mM HEPES and 2 mM glutamine. After 30 min at 37° C. the medium was replaced with the same containing 100 μCi/ml of [³⁵S]methionine (Amersham) and the plates incubated for 10 min at 37° C. The cells were washed twice with labeling medium containing 10× excess methionine and then incubated in the same medium for various times. The plates were put on ice, cells washed once with ice-cold PBS and finally lysis buffer (1% NP-40, 50 mM Tris-HCl, pH 7.6, 150 mM NaCl, 2 mM EDTA) containing 10 μg/ml PMSF (phenylmethylsulfonyl fluoride) was added. Cells were scraped off the plates, and nuclei removed by centrifugation at 6,000 rpm for 5 min at 4° C. in an Eppendorf centrifuge. Immunoprecipitation of proteins was performed as described (31). Briefly, antibody was added to lysate and the mixture kept on ice for 30 min. Complexes were recovered by binding to Pansorbin for 30 min on ice. Complexes were washed once with low salt buffer, once with high salt buffer, and once with 10 mM Tris-HCl, pH 7.5, before heating with gel loading buffer. To immunoprecipitate particular proteins, SDS was added to 0.1% and the mixture heated to 95° C. for 2 min followed by addition of 10 volumes of lysis buffer. Antibodies employed for the immunoprecipitation are as follows. Anti-E1 [8.139], anti-E2 [5.1] (36), and anti-C [12/2] (37) monoclonals have been described. The human transferrin receptor was precipitated with the monoclonal antibody OKT-9 in ascites fluid.
This preparation was provided by Thomas Ebel at our laboratory using a corresponding hybridoma cell line obtained from ATCC (American Type Culture Collection) No CRL 8021. Polyclonal rabbit anti-mouse dhfr was a kind gift from E. Hurt (European Molecular Biology Laboratory, Heidelberg, FRG) and rabbit anti-lysozyme has been described (38).

4. Immunofluorescence. To perform indirect immunofluorescence, infected cell monolayers on glass coverslips were rinsed twice with phosphate-buffered saline (PBS) and fixed in -20° C. methanol for 6 min. After fixation, the methanol was removed and the coverslip washed 3 times with PBS. Unspecific antibody binding was blocked by incubation at room temperature with PBS containing 0.5% gelatin and 0.25% BSA. The blocking buffer was removed and replaced with the same buffer containing primary antibody. After 30 min at room temperature the reaction was stopped by washing 3 times with PBS. Binding of secondary antibody (FITC-conjugated sheep anti-mouse [BioSys, Compiegne, France]) was done as for the primary antibody. After 3 washes with PBS and one rinse with water the coverslip was allowed to dry before mounting in Moviol 4-88 (Hoechst, Frankfurt am Main, FRG) containing 2.5% DABCO (1,4-diazabicyclo-[2.2.2]-octane).

5. DNA procedures. Plasmids were grown in Escherichia coli DH5α (Bethesda Research Laboratories) [recA endA1 gyrA96 thi1 hsdR17 supE44 relA1 Δ(lacZYA-argF)U169 φ80dlacZΔ(M15)]. All basic DNA procedures were done essentially as described (39). DNA fragments were isolated from agarose gels by the freeze-thaw method (40) including 3 volumes of phenol during the freezing step to increase yield and purity. Fragments were purified by benzoyl-naphthoyl-DEAE (BND) cellulose (Serva Feinbiochemica, Heidelberg, FRG) chromatography (41). Plasmids used for production of infectious RNA were purified by sedimentation through 1M NaCl followed by banding in CsCl (39). In some cases plasmids were purified by Qiagen chromatography (Qiagen GmbH, Dusseldorf, FRG).

6. Site-directed oligonucleotide mutagenesis. For oligonucleotide mutagenesis, relevant fragments of the SFV cDNA clone were subcloned into M13mp18 or mp19 (42) and transformed (43) into DH5αF'IQ [endA1 hsdR1 supE44 thi1 recA1 gyrA96 relA1 φ80dlacΔ(M15) Δ(lacZYA-argF)U169/F'proAB lacI^q lacZΔ(M15) Tn5] (Bethesda Research Laboratories). RF DNA from these constructs was transformed into RZ1032 (44) [Hfr KL16 dut1 ung1 thi1 relA1 supE44 zbd279:Tn10], and virus grown in the presence of uridine to incorporate uracil residues into the viral genome. Single stranded DNA was isolated by phenol extraction from PEG precipitated phage. Oligonucleotides were synthesized on an Applied Biosystems 380B synthesizer and purified by gel filtration over NAP-5 columns (Pharmacia). The oligonucleotides 5'-CGGCCAGTGAATTCTGATTGGATCCCGGGTAATTAATTGAATTACATCCCTACGCAAACG (SEQ ID NO.:13), 5'-GCGCACTATTATAGCACCGGCTCCCGGGTAATTAATTGACGCAAACGTTTTACGGCCGCCGG (SEQ ID NO.:14) and 5'-GCGCACTATTATAGCACCATGGATCCGGGTAATTAATTGACGTTTTACGGCCGCCGGTGGCG (SEQ ID NO.:15) were used to insert the new linker sites [BamHI-SmaI-XmaI] into the SFV cDNA clone. The oligonucleotides 5'-CGGCGGTCCTAGATTGGTGCG (SEQ ID NO.:16) and 5'-CGCGGGCGCCACCGGCGGCCG (SEQ ID NO.:17) were used as sequencing primers (SP1 and SP2) up- and downstream of the polylinker site. Phosphorylated oligonucleotides were used in mutagenesis with Sequenase (United States Biochemical, Cleveland, Ohio) as described earlier (44, 45). In vitro made RF forms were transformed into DH5αF'IQ and the resulting phage isolates analyzed for the presence of correct mutations by dideoxy sequencing according to the USB protocol for using Sequenase. Finally, mutant fragments were reinserted into the full-length SFV cDNA clone. Again, the presence of the appropriate mutations was verified by sequencing from the plasmid DNA. Deletion of the 6K region has been described elsewhere.

7. In vitro transcription. SpeI linearized plasmid DNA was used as template for in vitro transcription. RNA was synthesized at 37° C. for 1 h in 10-50 μl reactions containing 40 mM Tris-HCl (pH 7.6), 6 mM spermidine-HCl, 5 mM dithiothreitol (DTT), 100 μg/ml of nuclease free BSA, 1 mM each of ATP, CTP and UTP, 500 μM of GTP, 1 unit/μl of RNasin and 100-500 units/ml of SP6 RNA polymerase. For production of capped transcripts (46), the analogs m⁷G(5')ppp(5')G or m⁷G(5')ppp(5')A were included in the reaction at 1 mM. For quantitation of RNA production, trace amounts of α-[³²P]-UTP (Amersham) were included in the reactions and incorporation measured from trichloroacetic acid precipitates. When required, DNA or RNA was digested at 37° C. for 10 min by adding DNase I or RNase A at 10 units/μg template or 20 μg/ml respectively.

8. RNA transfection. Transfection of BHK monolayer cells by the DEAE-Dextran method was done as described previously (47). For transfection by electroporation, RNA was added either directly from the in vitro transcription reaction or diluted with transcription buffer containing 5 mM DTT and 1 unit/μl of RNasin. Cells were trypsinized, washed once with complete BHK-cell medium and once with ice-cold PBS (without MgCl₂ and CaCl₂) and finally resuspended in PBS to give 10⁷ cells/ml. Cells were either used directly or stored (in BHK medium) on ice overnight. For electroporation, 0.5 ml of cells were transferred to a 0.2 cm cuvette (BioRad), 10-50 μl of RNA solution added and the solution mixed by inverting the cuvette. Electroporation was performed at room temperature by two consecutive pulses at 1.5 kV/25 μF using a BioRad Gene Pulser apparatus with its pulse controller unit set at maximum resistance. After incubation for 10 min, the cells were diluted 1:20 in complete BHK-cell medium and transferred onto tissue culture plates. For plaque assays, the electroporated cells were plated together with about 3×10⁵ fresh cells per ml and incubated at 37° C. for 2 h, then overlaid with 1.8% low melting point agarose in complete BHK-cell medium. After incubation at 37° C. for 48 h, plaques were visualized by staining with neutral red.

9. Gel electrophoresis. Samples for Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) were prepared and run on 12% separating gels with a 5% stacking gel as previously described (48). For resolving the 6K peptide, a 10%-20% linear acrylamide gradient gel was used. Gels were fixed in 10% acetic acid-30% methanol for 30 min before exposing to Kodak XAR-5 film. When a gel was prepared for fluorography (49), it was washed after fixation for 30 min in 30% methanol and then soaked in 1M sodium salicylate-30% methanol for 30 min before drying. Nucleic acids were run on agarose gels using 50 mM Tris-borate-2.5 mM Na₂ EDTA as buffer. For staining 0.2 μg/ml of ethidium bromide was included in the buffer and gel during the run.

EXAMPLE 1

In this example a full-length SFV cDNA clone is prepared and placed in a plasmid containing the SP6 RNA polymerase promoter to allow in vitro transcription of full-length and infectious transcripts. This plasmid, which is designated pSP6-SFV4, was deposited on 28 Nov. 1991 at the PHLS Centre for Applied Microbiology & Research, European Collection of Animal Cell Cultures, Porton Down, Salisbury, Wiltshire, U.K., and given the provisional accession number 91112826.

As illustrated in FIG. 4A-C the strategy for constructing the SFV clone was to prime cDNA synthesis at several positions along the template RNA downstream of suitable restriction endonuclease sites defined by the known nucleotide sequence of the SFV RNA molecule. Virus RNA was isolated by phenol-chloroform extraction from purified virus (obtainable among others from the Arbovirus collection in Yale University, New Haven, USA) and used as template for cDNA synthesis as previously described (50). First strand synthesis was primed at three positions, using 5'-TTTCTCGTAGTTCTCCTCGTC (SEQ ID NO.:18) as primer-1 (SFV coordinate 2042-2062), 5'-GTTATCCCAGTGGTTGTTCTCGTAATA (SEQ ID NO.:19) as primer-2 (SFV coordinate 3323-3349) and an oligo-dT₁₂₋₁₈ as primer-3 (3' end of SFV) (FIG. 4A).
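As a simple consistency check, the length of each first-strand primer can be compared with the coordinate span quoted for it, and the genome-sense sequence to which the primer anneals can be obtained as its reverse complement. The sketch below is illustrative only; the helper names are hypothetical, and the sequences and coordinates are those given above.

```python
# Minimal sketch: compare primer lengths with their quoted SFV coordinate spans.
# Helper names are illustrative; sequences and coordinates come from the text.

PRIMERS = {
    "primer-1": ("TTTCTCGTAGTTCTCCTCGTC", 2042, 2062),        # SEQ ID NO.:18
    "primer-2": ("GTTATCCCAGTGGTTGTTCTCGTAATA", 3323, 3349),  # SEQ ID NO.:19
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]

def span_length(start: int, end: int) -> int:
    """Number of positions covered by an inclusive coordinate range."""
    return end - start + 1

for name, (seq, start, end) in PRIMERS.items():
    matches = len(seq) == span_length(start, end)
    print(name, len(seq), span_length(start, end), matches, reverse_complement(seq))
```

Both primers have the length implied by their coordinates (21 and 27 nucleotides respectively).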

Second strand synthesis was preceded by hybridization of the oligonucleotide 5'-ATGGCGGATGTGTGACATACACGACGCC (SEQ ID NO.:20; identical to the first 28 bases of the genome sequence of SFV) to the first strand cDNA. After completion of second strand synthesis cDNA was trimmed and in all cases except in the case of the primer-1 reaction, the double-stranded adaptor 5'-AATTCAAGCTTGCGGCCGCACTAGT/GTTCGAACGCCGGCGTGATCA-3' (SEQ ID NO.:21) (5'-sticky-EcoRI-HindIII-NotI-XmaIII-SpeI-blunt-3') was added and the cDNA cloned into EcoRI cleaved pTZ18R (Pharmacia, Sweden) as described (51). The cloning of the 5' end region was done in a different way. Since SFV contains a HindIII site at position 1947, cDNA primed with primer-1 should contain this area and therefore HindIII could be used to define the 3' end of that cDNA. To obtain a restriction site at the very 5' end of the SFV, cDNA was cloned into SmaI-HindIII cut pGEM1 (Promega Biotec, Madison, Wis.). Since the SFV genome starts with the sequence 5'-ATGG, ligation of this onto the blunt CCC-3' end of the SmaI site created an NcoI site C'CATGG. Although the SFV sequence contains 3 NcoI sites, none of these are within the region preceding the HindIII site, and thus these 5' end clones could be further subcloned as NcoI-HindIII fragments into a vector especially designed for this purpose (see below). The original cDNA clones in pGEM1 were screened by restriction analysis and all clones containing inserts larger than 1500 bp were selected for further characterization by sequencing directly from the plasmid into both ends of the insert, using SP6 or T7 sequencing primers. The SFV 5'-end clones in pTZ18R were sequenced using lac sequencing primers. To drive in vitro synthesis of SFV RNA the SP6 promoter was used. Cloning of the SFV 5' end in front of this promoter without adding too many foreign nucleotides required that a derivative of pGEM1 had to be constructed.
Hence, pGEM1 was opened at EcoRI and Bal31 deletions were created, the DNA blunted with T4 DNA polymerase and an NcoI oligonucleotide (5'-GCCATGGC; SEQ ID NO.:22) added. The clones obtained were screened by colony hybridization (39) with the oligonucleotide 5'-GGTGACACTATAGCCATGGC (SEQ ID NO.:23) designed to pick up (at suitable stringency) the variants that had the NcoI sequence immediately at the transcription initiation site of the SP6 promoter (G underlined). Since the Bal31 deletion had removed all restriction sites of the multicloning site of the original plasmid, these were restored by cloning a PvuI-NcoI fragment from the new variant into another variant of pGEM1 (pDH101) that had an NcoI site inserted at its HindIII position in the polylinker. This created the plasmid pDH201. Finally, the adaptor used for cloning the SFV cDNA was inserted into pDH201 between the EcoRI and PvuII sites to create plasmid pPLH211 (FIG. 4B). This plasmid was then used as recipient for SFV cDNA fragments in the assembly of the full-length clone by combining independent overlapping subclones using these sites. The fragments and the relevant restriction sites used to assemble the full-length clone, pSP6-SFV4, are depicted in FIG. 4A. For the 5'-end, the selected fragment contained the proper SFV sequence 5'-ATGG, with one additional G-residue in front. When this G-residue was removed it reduced transcription efficiency from SP6 but did not affect infectivity of the in vitro made RNA. Thus, the clone used for all subsequent work contains the G-residue at the 5' end. For the 3'-end of the clone, a cDNA fragment containing 69 A-residues was selected. By inclusion of the unique SpeI site at the 3'-end of the cDNA, the plasmid can be linearized to allow for runoff transcription in vitro giving RNA carrying 70 A-residues. FIG. 4C shows the 5' and 3' border sequences of the SFV cDNA clone.
The general outline of how to obtain and demonstrate infectivity of the full-length SFV RNA is depicted in FIG. 6. The complete nucleotide sequence of the pSP6-SFV4 SP6 transcript together with the amino acid sequences of the nonstructural and the structural polyproteins is shown in FIG. 5.
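The formation of the NcoI site at the junction between the blunt SmaI end and the SFV 5' end, and the design of the colony-hybridization probe, can be illustrated with plain string operations. The sketch below is a hypothetical illustration, not part of the described procedure; the recognition sequences CCCGGG (SmaI, blunt cut after CCC) and CCATGG (NcoI) are standard, and the remaining sequences are those quoted above.

```python
# Minimal sketch: junction created by ligating the SFV 5' end to a SmaI blunt end.
# Variable names are illustrative; sequences are those quoted in the text.

SMAI_SITE = "CCCGGG"   # SmaI cuts blunt between CCC and GGG
NCOI_SITE = "CCATGG"   # NcoI recognition sequence
SFV_5_PRIME = "ATGGCGGATGTGTGACATACACGACGCC"  # SEQ ID NO.:20, first 28 bases of SFV

vector_blunt_end = SMAI_SITE[:3]            # the CCC-3' end left after SmaI cleavage
junction = vector_blunt_end + SFV_5_PRIME   # blunt-end ligation product (top strand)

ncoi_position = junction.find(NCOI_SITE)    # -1 would mean that no site was created
print("NcoI site at junction index:", ncoi_position)

# Screening probe: 3' end of the SP6 promoter followed by the NcoI oligonucleotide.
NCOI_OLIGO = "GCCATGGC"                     # SEQ ID NO.:22
SCREENING_PROBE = "GGTGACACTATAGCCATGGC"    # SEQ ID NO.:23
probe_ends_with_linker = SCREENING_PROBE.endswith(NCOI_OLIGO)
print("Probe ends with NcoI oligonucleotide:", probe_ends_with_linker)
```

The NcoI site arises only from the combination of vector and insert bases, since the 28-base SFV 5' sequence on its own contains no CCATGG.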

Typically, about 5 μg of RNA per 100 ng of template was obtained using 10 units of polymerase, but the yield could be increased considerably by the use of more enzyme. The conditions slightly differ from those reported earlier for the production of infectious transcripts of alphaviruses (47, 52). A maximum production of RNA was obtained with rNTP concentrations at 1 mM. However, since infectivity also is dependent on the presence of a 5' cap structure, optimal infectivity was obtained when the GTP concentration in the transcription reaction was halved. This drop had only a marginal effect on the amounts of RNA produced but raised the specific infectivity by a factor of 3 (data not shown).
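The effect of halving the GTP concentration on the ratio of cap analogue to GTP, and the mass yield of RNA relative to template, can be illustrated as follows. The final concentrations are those given in the in vitro transcription method; the 10 mM stock concentration and the 50 μl reaction volume used in the usage line are assumptions made only for this sketch, and the function names are hypothetical.

```python
# Minimal sketch: cap analogue to GTP ratio, mass yield and pipetting volumes.
# Final concentrations (mM) are from the text; the stock concentration is ASSUMED.

FINAL_MM = {"ATP": 1.0, "CTP": 1.0, "UTP": 1.0, "GTP": 0.5, "cap analogue": 1.0}

def cap_to_gtp_ratio(final_mm: dict) -> float:
    """Molar ratio of cap analogue to GTP in the reaction."""
    return final_mm["cap analogue"] / final_mm["GTP"]

def mass_yield_ratio(rna_ug: float, template_ng: float) -> float:
    """Mass of RNA produced per mass of DNA template (both converted to ng)."""
    return (rna_ug * 1000.0) / template_ng

def stock_volume_ul(final_mm: float, stock_mm: float, reaction_ul: float) -> float:
    """Volume of a stock solution needed to reach a final concentration."""
    return final_mm * reaction_ul / stock_mm

ASSUMED_STOCK_MM = 10.0   # hypothetical stock concentration, not from the text
REACTION_UL = 50.0        # upper end of the 10-50 microlitre range in the text

print(cap_to_gtp_ratio(FINAL_MM))
print(mass_yield_ratio(rna_ug=5.0, template_ng=100.0))
for component, conc in FINAL_MM.items():
    print(component, stock_volume_ul(conc, ASSUMED_STOCK_MM, REACTION_UL), "ul")
```

With GTP at 500 μM and the cap analogue at 1 mM, the analogue is present in twofold molar excess over GTP, and 5 μg of RNA from 100 ng of template corresponds to a fiftyfold mass yield.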

The cDNA sequence shown in FIGS. 5A-5Q has been used in the following examples. However, sequences having one or a few nucleotides, which differ from those shown in FIGS. 5A-5Q, could also be useful as vectors, even if these might be less efficient as illustrated above with the SFV cDNA sequence lacking the first 5'-G nucleotide in FIG. 5A.

EXAMPLE 2

In this example the construction of SFV DNA expression vectors is disclosed.

The cDNA clone coding for the complete genome of SFV obtained in Example 1 was used to construct a SFV DNA expression vector by deletion of the coding region of the 26S structural genes to make way for heterologous inserts. However, the nonstructural coding region, which is required for the production of the nsP1-4 replicase complex is preserved. RNA replication is dependent on short 5' (nt 1-247) (53, 54, 55) and 3' (nt 11423-11441) sequence elements (56, 57), and therefore, also these had to be included in the vector construct, as had the 26S promoter just upstream of the C gene (17, 18).

As is shown in FIG. 7A-7C, first, the XbaI (6640)-NsiI (8927) fragment from the SFV cDNA clone pSP6-SFV4 from Example 1 was cloned into pGEM7Zf(+) (Promega Corp., Wis., USA) (step A). From the resulting plasmid, pGEM7Zf(+)-SFV, the EcoRI fragment (SFV coordinates 7391 and 8746) was cloned into M13mp19 to insert a BamHI-XmaI-SmaI polylinker sequence immediately downstream from the 26S promoter site using site-directed mutagenesis (step B). Once the correct mutants had been verified by sequencing from M13 ssDNA (single stranded), the EcoRI fragments were reinserted into pGEM7Zf(+)-SFV (step C) and then cloned back as XbaI-NsiI fragments into pSP6-SFV4 (step D). To delete the major part of the cDNA region coding for the structural proteins of SFV, these plasmids were then cut with AsuII (7783) and NdeI (11033), blunted using Klenow fragment in the presence of all four nucleotides, and religated to create the final vectors designated pSFV1, pSFV2 and pSFV3, respectively (step E). The vectors retain the promoter region of the 26S subgenomic RNA and the last 49 amino acids of the E1 protein as well as the complete non-coding 3' end of the SFV genome.

In the vectors the subgenomic (26S) protein coding portion has been replaced with a polylinker sequence allowing the insertional cloning of foreign cDNA sequences under the 26S promoter. As is shown in FIG. 8 these three vectors have the same basic cassette inserted downstream from the 26S promoter, i.e. a polylinker (BamHI-SmaI-XmaI) followed by a translational stop codon in all three reading frames. The vectors differ as to the position where the polylinker cassette has been inserted. In pSFV1 the cassette is situated 31 bases downstream of the 26S transcription initiation site. The initiation motif of the capsid gene translation is identical to the consensus sequence (58). Therefore, this motif has been retained in pSFV2, where the cassette is placed immediately after the motif of the capsid gene. Finally, pSFV3 has the cassette placed immediately after the initiation codon (AUG) of the capsid gene. Sequencing primers (SP) needed for checking both ends of an insert have been designed to hybridize either to the 26S promoter region (SP1), or to the region following the stop codon cassette (SP2).
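The statement that the cassette provides restriction sites followed by a stop codon in all three reading frames can be checked on the sequence itself. The sketch below assumes that the cassette corresponds to the stretch GGATCCCGGGTAATTAATTGA contained in the first mutagenic oligonucleotide (SEQ ID NO.:13) of the site-directed mutagenesis method; that assignment, and all names used, are assumptions made for illustration.

```python
# Minimal sketch: restriction sites and stop codons in the polylinker cassette.
# ASSUMPTION: the cassette is the stretch below, taken from SEQ ID NO.:13.

CASSETTE = "GGATCCCGGGTAATTAATTGA"
STOP_CODONS = {"TAA", "TAG", "TGA"}
RECOGNITION_SITES = {"BamHI": "GGATCC", "SmaI/XmaI": "CCCGGG"}

def frames_with_stop(seq: str) -> set:
    """Return the reading frames (0, 1, 2) that contain at least one stop codon."""
    frames = set()
    for frame in range(3):
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        if any(codon in STOP_CODONS for codon in codons):
            frames.add(frame)
    return frames

def site_positions(seq: str) -> dict:
    """Return the first index of each recognition site (-1 if absent)."""
    return {name: seq.find(site) for name, site in RECOGNITION_SITES.items()}

print(site_positions(CASSETTE))
print(sorted(frames_with_stop(CASSETTE)))
```

Under this assumption the BamHI and SmaI/XmaI sites overlap at the start of the cassette, and the run TAATTAATTGA supplies one stop codon in each of the three frames.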

Note that the 26S promoter overlaps with the 3'-end of the nsP4 coding region. For pSFV2, the cloning site is positioned immediately after the translation initiation site of the SFV capsid gene. For pSFV3, the cloning site is positioned three nucleotides further downstream, i.e. immediately following the initial AUG codon of the SFV capsid gene. The three translation stop codons following the polylinker are boxed. The upstream sequencing primer (SP1) overlaps with the 26S promoter, and the downstream sequencing primer (SP2) overlaps the XmaIII site.

EXAMPLE 3

In this example an in vivo packaging system encompassing helper virus vector constructs is prepared.

The system allows SFV variants defective in structural protein functions, or recombinant RNAs derived from the expression vector construct obtained in Example 2, to be packaged into infectious virus particles. Thus, this system allows recombinant RNAs to be introduced into cells by normal infection. The helper vector, called pSFV-Helper1, is constructed by deleting the region between the restriction endonuclease sites AccI (308) and AccI (6399) of pSP6-SFV4 obtained in Example 1 by cutting and religation as shown in FIG. 7B, step F. The vector retains the 5' and 3' signals needed for RNA replication. Since almost the complete nsP region of the Helper vector is deleted, RNA produced from this construct will not replicate in the cell due to the lack of a functional replicase complex. As is shown in FIG. 9, after transcription in vitro of pSFV1-recombinant and helper cDNAs, helper RNA is cotransfected with the pSFV1-recombinant derivative. The helper construct provides the structural proteins needed to assemble new virus particles, and the recombinant provides the nonstructural proteins needed for RNA replication, so that SFV particles comprising recombinant genomes are produced. The cotransfection is preferably performed by electroporation as is disclosed in Example 6 and preferably BHK cells are used as host cells.

To package the RNA a region at the end of nsP1 is required, an area which has been shown to bind capsid protein (57, 59). Since the Helper lacks this region, RNA derived from this vector will not be packaged and hence, transfections with recombinant and Helper produce only virus particles that carry recombinant-derived RNA. It follows that these viruses cannot be passaged further and thus provide a one-step virus stock. The advantage is that infections with these particles will not produce any viral structural proteins.
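The complementary design of the expression vectors and the helper can be summarized by the extent of the two deletions, using the restriction site coordinates given in Examples 2 and 3. The sketch below is an approximate illustration only: the coordinates denote site positions rather than exact cut points, end-filling and religation shift the junctions by a few bases, and all names are hypothetical.

```python
# Minimal sketch: approximate sizes of the deletions in the vector and the helper.
# Coordinates are the site positions quoted in the text; sizes are approximate.

VECTOR_DELETION = ("AsuII", 7783, "NdeI", 11033)   # pSFV1-3, structural region
HELPER_DELETION = ("AccI", 308, "AccI", 6399)      # pSFV-Helper1, nsP region

def deletion_size(deletion: tuple) -> int:
    """Approximate number of nucleotides between the two site coordinates."""
    _, start, _, end = deletion
    return end - start

def deletions_overlap(a: tuple, b: tuple) -> bool:
    """True if the two deleted coordinate ranges share any positions."""
    return a[1] < b[3] and b[1] < a[3]

print("vector deletion:", deletion_size(VECTOR_DELETION), "nt")
print("helper deletion:", deletion_size(HELPER_DELETION), "nt")
print("overlap:", deletions_overlap(VECTOR_DELETION, HELPER_DELETION))
```

The two deletions of roughly 3250 and 6091 nucleotides do not overlap, which reflects the arrangement described above in which each RNA supplies the functions removed from the other.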

EXAMPLE 4

This example illustrates the construction of variants of the full-length SFV cDNA clone from Example 1 that allow insertion of foreign DNA sequences encoding foreign epitopes, and the production of recombinant (chimaeric) virus carrying said foreign epitopes as integral parts of the p62, E2 or E1 spike proteins.

To this end, a thorough knowledge of the function, topology and antigenic structure of the E2 and E1 envelope proteins has been of the essence. Earlier studies on the pathogenicity of alphaviruses have shown that antibodies against E2 are type-specific and have good neutralizing activity while those against E1 generally are group-specific and are nonneutralizing (5). However, not until recently have antigenic sites of the closely related alphaviruses SFV, Sindbis, and Ross River been mapped and correlated to the level of amino acid sequence (60, 61, 62, 63). These studies have shown that the most dominant sites in question are at amino acid positions 216, 234 and 246-251 of the SFV E2 spike protein. Interestingly, these three sites are exactly the same as the ones predicted by computer analysis. In the present example domain 246-251 was used, since this area has a highly conserved structure and hydropathy profile within the group of alphaviruses. Insertion of a gene encoding a foreign epitope into the 246-251 region of the pSP6-SFV4 p62 protein yields particles with one new epitope on each heterodimer, i.e. 240 copies.

To create a unique restriction endonuclease site that would allow specific insertion of foreign epitopes into the E2 portion of the SFV genome, a BamHI site was inserted by site directed mutagenesis using the oligonucleotide 5'-GATCGGCCTAGGAGCCGAGAGCCC-3' (SEQ ID NO.:24).

EXAMPLE 5

In this example a conditionally lethal variant of SFV is constructed from the SFV cDNA obtained in Example 1, which variant carries a mutation in the p62 protein resulting in a noncleavable form of said protein, with the result that this variant as such cannot infect new host cells, unless first cleaved with exogenously added protease.

As illustrated in FIG. 10, this construct can be advantageously used as a vaccine carrier for foreign epitopes, since this form of the virus cannot enter new host cells although assembled with wild type efficiency in transfected cells. The block can be overcome by trypsin treatment of inactive virus particles. This converts the particle into a fully entry-competent form which can be used for amplification of this virus variant stock.

Once activated the SFV variant will enter cells normally through the endocytic pathway and start infection. Viral proteins will be made and budding takes place at the plasma membrane. However, all virus particles produced will be of inactive form and the infection will thus cease after one round of infection. The reason for the block in infection proficiency is a mutation which has been introduced by site directed mutagenesis into the cleavage site of p62. This arginine to leucine substitution (at amino acid position 66 of the E3 portion of the p62 protein) changes the consensus features of the cleavage site so that it will not be recognized by the host cell proteinase that normally cleaves the p62 protein to the E2 and E3 polypeptides during transport to the cell surface. Instead, only exogenously added trypsin will be able to perform this cleavage, which in this case occurs at the arginine residue 65 immediately preceding the original cleavage site. As this cleavage regulates the activation of the entry function potential of the virus by controlling the binding of the entry spike subunit, the virus particle carrying only uncleaved p62 will be completely unable to enter new host cells.

The creation of the cleavage-deficient mutation in E2 has been described earlier (29). An AsuII-Nsλ fragment spanning this region was then isolated and cloned into the full-length cDNA clone pSP6-SFV4.

EXAMPLE 6

In this example transfection of BHK cells with SFV RNA molecules transcribed in vitro from full-length cDNA from Example 1 or variants thereof or the SFV vectors from Example 2, which comprise exogenous DNA, is disclosed. The transfection is carried out by electroporation which is shown to be very efficient at optimized conditions.

BHK cells were transfected with the above SFV RNA molecules by electroporation and optimal conditions were determined by varying parameters like temperature, voltage, capacitance, and number of pulses. Optimal transfection was obtained by 2 consecutive pulses of 1.5 kV at 25 μF, under which negligible amounts of cells were killed. It was found that it was better to keep the cells at room temperature than at 0° C. during the whole procedure. Transfection by electroporation was also measured as a function of input RNA. As expected, an increase in transfection frequency was not linearly dependent on RNA concentration, and about 2 μg of cRNA were needed to obtain 100% transfection.

Compared with conventional transfection, this is a great improvement. For example, even under optimal conditions, DEAE-dextran transfection transfected only 0.2% of the cells.
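As a rough worked example of the figures quoted above, the following sketch computes the nominal energy stored per pulse at 1.5 kV and 25 μF, and the fold difference between 100% and 0.2% transfection. It assumes an ideal capacitor (E = ½CV²); the energy actually delivered to the cuvette depends on the instrument and buffer and is not stated in the text. The function names are illustrative.

```python
# Nominal pulse energy and transfection-efficiency comparison.
# Assumes an ideal capacitor; function names are hypothetical.

def capacitor_energy_joules(capacitance_farads, voltage_volts):
    """Energy stored in an ideal capacitor: E = 1/2 * C * V^2."""
    return 0.5 * capacitance_farads * voltage_volts ** 2


def fold_improvement(new_percent, old_percent):
    """Ratio of two transfection efficiencies given in percent."""
    return new_percent / old_percent


PULSE_ENERGY_J = capacitor_energy_joules(25e-6, 1.5e3)  # one 1.5 kV, 25 uF pulse
TOTAL_ENERGY_J = 2 * PULSE_ENERGY_J                     # two consecutive pulses
FOLD = fold_improvement(100.0, 0.2)                     # electroporation vs DEAE-dextran

if __name__ == "__main__":
    print(f"energy per pulse: {PULSE_ENERGY_J:.3f} J")
    print(f"energy for two pulses: {TOTAL_ENERGY_J:.3f} J")
    print(f"fold improvement: {FOLD:.0f}x")
```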

EXAMPLE 7

This example illustrates heterologous gene expression driven by the SFV vector, pSFV1 from Example 2, for genes encoding the 21 kD cytoplasmic mouse dihydrofolate reductase (dhfr), the 90 kD membrane protein human transferrin receptor (TR), and finally the 14 kD secretory protein chicken lysozyme. The dhfr gene was isolated from pGEM2-dhfr (64) as a BamHI-HindIII fragment blunted with Klenow fragment and inserted into SmaI-cut pSFV1. The transferrin receptor gene was first cloned from pGEM1-TR (64, 65) as an XbaI-EcoRI fragment into pGEM7ZF(+) and subsequently from there as a BamHI fragment into pSFV1. Finally, a BamHI fragment from pGEM2 carrying the lysozyme gene (21) was cloned into pSFV1.

To study the expression of the heterologous proteins, in vitro-made RNA of the dhfr and TR constructs was electroporated into BHK cells. RNA of wild type SFV was used as control. At different time points post electroporation (p.e.) cells were pulse-labeled for 10 min followed by a 10 min chase, whereafter the lysates were analyzed by gel electrophoresis and autoradiography. The results are shown in FIGS. 11A-11E. More specifically, BHK cells were transfected with RNAs of wild type SFV, pSFV1-dhfr, and pSFV1-TR, and pulse-labeled at 3, 6, 9, 12, 15 and 24 h p.e. Equal amounts of lysate were run on a 12% gel. The 9 h sample was also used in immunoprecipitation (IP) of the SFV, the dhfr and the transferrin receptor proteins. Cells transfected with pSFV1-lysozyme were pulse-labeled at 9 h p.e. and then chased for the times (hours) indicated. An equal portion of lysate or medium was loaded on the 13.5% gel. IP represents immunoprecipitation from the 1 h chase lysate sample. The U-lane is lysate of labeled but untransfected cells. At 3 h p.e. hardly any exogenous proteins were made, since the incoming RNA starts with minus strand synthesis which does not peak until about 4-5 h p.e. (5). At this time point, almost all labeled proteins were of host origin. In contrast, at 6 h p.e. the exogenous proteins were synthesized with great efficiency, and severe inhibition of host protein synthesis was evident. This was even more striking at 9 h p.e., when maximum shut down of host protein synthesis had been reached. Efficient production of the heterologous proteins continued up to 24 h p.e., after which production slowed down (data not shown), indicating that the cells had entered a stationary phase.

Since chicken lysozyme is a secretory protein, its expression was analyzed both from cell lysates and from the growth medium. Cells were pulse-labeled at 9 h p.e. and then chased up to 8 h. The results are shown in FIGS. 11A-11D. FIG. 11A shows the result of expression of wild-type viral proteins. FIG. 11B shows the expression of human transferrin receptor. FIG. 11C shows the expression of mouse dihydrofolate reductase. FIG. 11D shows the expression of chicken lysozyme. Although lysozyme was slowly secreted, almost all labeled material was secreted to the medium during the chase.

EXAMPLE 8

This example illustrates the present in vivo packaging system.

In vitro-made RNA of pSFV1-TR was mixed with Helper RNA at different ratios and these mixtures were co-transfected into BHK cells. Cells were grown for 24 h, after which the culture medium was collected and the virus particles pelleted by ultracentrifugation. The number of infectious units (i.u.) was determined by immunofluorescence. It was found that a 1:1 ratio of Helper and recombinant RNA most efficiently produced infectious particles, and on average 5×10⁶ cells yielded 2.5×10⁹ i.u. The infectivity of the virus stock was tested by infecting BHK cells at different multiplicities of infection (m.o.i.). The results for expression of human transferrin receptor in BHK cells after infection by such in vivo packaged particles carrying pSFV1-TR recombinant RNA are shown at the lower right of FIG. 11E. 200 μl of virus diluted in MEM (including 0.5% BSA and 2 mM glutamine) was overlaid on cells to give m.o.i. values ranging from 5 to 0.005. After 1 h at 37° C., complete BHK medium was added and growth continued for 9 h, at which time a 10 min pulse (100 μCi ³⁵S-methionine/ml) and 10 min chase was performed, and the cells dissolved in lysis buffer. 10 μl out of the 300 μl lysate (corresponding to 30,000 cells) was run on the 10% gel, and the dried gel was exposed for 2 h at -70° C. Due to the high expression level, only 3,000 cells are needed to obtain a distinct band on the autoradiograph with an overnight exposure.
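The quantities in this paragraph are linked by simple arithmetic, sketched below. The helper names are illustrative, and the cell number per dish is inferred here from the statement that 10 μl of a 300 μl lysate corresponds to 30,000 cells; it is not given directly in the text.

```python
# Worked arithmetic for the packaging yield and infection series.
# Function names are hypothetical; the numbers are those quoted in the text.

def yield_per_cell(total_iu, producer_cells):
    """Infectious units produced per transfected cell."""
    return total_iu / producer_cells


def cells_in_lysate(cells_loaded, loaded_ul, lysate_ul):
    """Total cells in the lysate, scaled up from the loaded aliquot."""
    return cells_loaded * lysate_ul / loaded_ul


def iu_needed(moi, target_cells):
    """Infectious units needed to reach a given m.o.i. on target_cells."""
    return moi * target_cells


IU_PER_CELL = yield_per_cell(2.5e9, 5e6)           # packaging yield per producer cell
CELLS_PER_DISH = cells_in_lysate(30_000, 10, 300)  # inferred cells per infected dish
IU_AT_MOI_5 = iu_needed(5, CELLS_PER_DISH)         # top of the dilution series
IU_AT_MOI_0_005 = iu_needed(0.005, CELLS_PER_DISH) # bottom of the dilution series
UL_FOR_3000_CELLS = 3_000 / (30_000 / 10)          # lysate volume holding 3,000 cells

if __name__ == "__main__":
    print(f"{IU_PER_CELL:.0f} i.u. per producer cell")
    print(f"{CELLS_PER_DISH:.0f} cells per dish (inferred)")
    print(f"{IU_AT_MOI_5:.3g} i.u. for m.o.i. 5")
    print(f"{IU_AT_MOI_0_005:.3g} i.u. for m.o.i. 0.005")
    print(f"{UL_FOR_3000_CELLS:.1f} ul lysate for 3,000 cells")
```

On these figures, a single packaging run yielding 2.5×10⁹ i.u. exceeds the 4.5×10⁶ i.u. needed for the highest m.o.i. dish by several hundred-fold.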

Thus, it was found that efficient protein production and concomitant host protein shut-off occurred at about 1 i.u. per cell. Since one SFV infected cell produces on the average 10⁸ capsid protein molecules, it follows that a virus stock produced from a single electroporation can be used to produce 10¹⁷ protein molecules equaling about 50 mg of protein.
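The order-of-magnitude estimate above can be reproduced with a short calculation. This is a sketch under stated assumptions: one infectious unit is taken to infect one cell, each infected cell is taken to make 10⁸ molecules of product, and the molecular mass of the product is a free parameter that the text does not specify. The 100 kDa value used below is purely illustrative.

```python
# Order-of-magnitude protein yield from one packaged virus stock.
# The molecular mass is an assumed input, not a value from the text.

AVOGADRO = 6.02214076e23  # molecules per mole


def total_molecules(stock_iu, molecules_per_cell):
    """Molecules produced if every infectious unit infects one cell."""
    return stock_iu * molecules_per_cell


def mass_mg(molecules, molar_mass_g_per_mol):
    """Convert a molecule count to milligrams for a given molar mass."""
    return molecules / AVOGADRO * molar_mass_g_per_mol * 1000.0


MOLECULES = total_molecules(2.5e9, 1e8)      # stock size x molecules per cell
MASS_MG_100KDA = mass_mg(MOLECULES, 100_000) # assumed 100 kDa product

if __name__ == "__main__":
    print(f"{MOLECULES:.2e} molecules")
    print(f"{MASS_MG_100KDA:.1f} mg at an assumed 100 kDa")
```

The molecule count is of the order of 10¹⁷, as stated; the corresponding mass scales linearly with whatever molecular mass is assumed for the expressed protein.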

From the foregoing experimental results it is evident that the present invention relates to a very useful and efficient expression system which lacks several of the disadvantages of the hitherto existing expression systems. The major advantages of the present system are briefly summarized as follows:

(1) High titre recombinant virus stocks can be produced in one day by one transfection experiment. There is no need for selection/screening, plaque purification and amplification steps. This is valuable, since easy production of recombinant virus is especially important in experiments where the phenotypes of large series of mutants have to be characterized.

(2) The recombinant virus stock is free from helper virus since only the recombinant genome but not the helper genome contains a packaging signal.

(3) The recombinant virus can be used to introduce the recombinant genome in a "natural" and non-leaky way into a large variety of cells, including insect and most higher eucaryotic cell types. Such a wide host range is very useful for an expression system, especially when cell-type-specific posttranslational modification reactions are required for the activity of the expressed protein.

(4) The level of protein expression obtained is extremely high, corresponding to that of the viral proteins during infection. There is also a host cell protein shut-off which makes it possible to follow the foreign proteins clearly in cell lysates without the need for antibody mediated antigen concentration. This will facilitate DNA expression experiments in cell biology considerably. Furthermore, problems of interference by the endogenous counterpart to an expressed protein (e.g. homo-oligomerization reactions) can be avoided.

EXAMPLE 9

This example illustrates epitope carriers. A very important example where vaccine development is of the utmost importance concerns the acquired immunodeficiency syndrome (AIDS) caused by the human immunodeficiency virus HIV-1 (66, 67). So far, all attempts to produce an efficient vaccine against HIV-1 have failed, although there was a very recent report that vaccination with disrupted SIV-1 (Simian immunodeficiency virus) may to a certain extent give protection against infections of that virus (68). However, development of a safe and effective vaccine against HIV-1 will be very difficult due to the biological properties of the virus. In the present example one epitope of HIV-1 was inserted into an antigenic domain of the E2 protein of SFV. The epitope used is located in glycoprotein gp120 of HIV-1, spanning amino acids 309-325. This forms the variable loop of HIV-1 and is situated immediately after an N-glycosylated site.

A chimaera was constructed where the 309-325 epitope of HIV was inserted into the BamHI site using cassette insertion of ready-made oligonucleotides encoding the HIV epitope. The required base substitutions at the BamHI site did not introduce any new amino acids into the vector, although two amino acids (Asp and Glu) changed places. This change did not have any deleterious effect, since in vitro-made vector RNA induced cell infection with wild type efficiency. FIG. 12A shows the sequences in the area of interest in the epitope carrier. In preliminary experiments, it has been shown that chimaeric proteins were produced. The proteins can be immunoprecipitated with anti-HIV antibodies. It is to be expected that these can also be used for production of chimaeric virus particles that can be used for vaccine preparation against HIV. Such particles are shown in FIG. 12B.

List of references

1) Bishop, D. H. L. (1990). Gene expression using insect cells and viruses. In Current Opinion in Biotechnology, Vol. 1, Rosenberg, M., and Moss, B., eds. (London: Current Opinion Ltd.), pp. 62-67.

2) Moss, B. (1990). Regulation of Vaccinia virus transcription. Ann. Rev. Biochem. 59, 661-688.

3) Moss, B., and Flexner, C. (1989). Vaccinia virus expression vectors. Ann. N.Y. Acad. Sci. 569, 86-103.

4) Garoff, H., Kondor-Koch, C., and Riedel, H. (1982). Structure and assembly of alphaviruses. Curr. Top. Microbiol. Immunol. 99, 1-50.

5) Strauss, E. G., and Strauss, J. H. (1986). Structure and replication of the alphavirus genome. In The Togaviridae and Flaviviridae, Vol. Schlesinger, S. S., and Schlesinger, M. J., eds (New York: Plenum Press), pp. 35-90.

6) Garoff, H., Frischauf, A.-M., Simons, K., Lehrach, H., and Delius, H. (1980). Nucleotide sequence of cDNA coding for Semliki Forest virus membrane glycoproteins. Nature 288, 236-241.

7) Takkinen, K. (1986). Complete nucleotide sequence of the nonstructural protein genes of Semliki forest virus. Nucl. Acids Res. 14, 5667-5682.

8) de Groot, R. J., Hardy, W. R., Shirako, Y., and Strauss, J. H. (1990). Cleavage-site preferences of Sindbis virus polyproteins containing the non-structural proteinase. Evidence for temporal regulation of polyprotein processing in vivo. EMBO J. 9, 2631-2638.

9) Hahn, Y. S., Strauss, E. G., and Strauss, J. H. (1989b). Mapping of RNA-temperature-sensitive mutants of Sindbis virus: assignment of complementation groups A, B, and G to nonstructural proteins. J. Virol. 63, 3142-3150.

10) Mi, S., Durbin, R., Huang, H. V., Rice, C. M., and Stollar, V. (1989). Association of the Sindbis virus RNA methyltransferase activity with the nonstructural protein nsP1. Virology 170, 385-391.

11) Ding, M., and Schlesinger, M. J. (1989). Evidence that Sindbis virus nsP2 is an autoprotease which processes the virus non-structural polyprotein. Virology 171, 280-284.

12) Hardy, W. R., and Strauss, J. H. (1989). Processing the nonstructural polyproteins of Sindbis virus: nonstructural proteinase is in the C-terminal half of nsP2 and functions both in cis and in trans. J. Virol. 63, 4653-4664.

13) Li, G., La Starza, M. W., Hardy, W. R., Strauss, J. H., and Rice, C. M. (1990). Phosphorylation of Sindbis virus nsP3 in vivo and in vitro.

14) Peranen, J., Takkinen, K., Kalkkinen, N., and Kaariainen, L. (1988). Semliki Forest virus-specific nonstructural protein nsP3 is a phosphoprotein. J. Gen. Virol. 69, 2165-2178.

15) Hahn, Y. S., Grakoui, A., Rice, C. M., Strauss, E. G., and Strauss, J. H. (1989a). Mapping of RNA-temperature-sensitive mutants of Sindbis virus: complementation group F mutants have lesions in nsP4.

16) Sawicki, D. L., Barkhimer, D. B., Sawicki, S. G., Rice, C. M., and Schlesinger, S. (1990). Temperature sensitive shut-off of alphavirus minus strand RNA synthesis maps to a nonstructural protein, nsP4. Virology 174, 43-52.

17) Grakoui, A., Levis, R., Raju, R., Huang, H. V., and Rice, C. M. (1989). A cis-acting mutation in the Sindbis virus junction region which affects subgenomic RNA synthesis. J. Virol. 63, 5216-5227.

18) Levis, R., Schlesinger, S., and Huang, H. V. (1990). Promoter for Sindbis virus RNA-dependent subgenomic RNA transcription. J. Virol. 64, 1726-1733.

19) Schlesinger, S. S., and Schlesinger, M. J. (1986). Formation and assembly of alphavirus glycoproteins. In The Togaviridae and Flaviviridae, Vol. Schlesinger, S. S., and Schlesinger, M. J., eds. (New York: Plenum Press), pp. 121-148.

20) Hahn, C. S., and Strauss, J. H. (1990). Site-directed mutagenesis of the proposed catalytic amino acids of the Sindbis virus capsid protein autoprotease. J. Virol. 64, 3069-3073.

21) Melancon, P., and Garoff, H. (1987). Processing of the Semliki Forest virus structural polyprotein; Role of the capsid protease. J. Virol. 61, 1301-1309.

22) Bonatti, S., Migliaccio, G., Blobel, G., and Walter, P. (1984). Role of the signal recognition particle in the membrane assembly of Sindbis viral glycoprotein. Eur. J. Biochem. 140, 499-502.

23) Garoff, H., Simons, K., and Dobberstein, B. (1978). Assembly of Semliki Forest virus membrane glycoproteins in the membrane of the endoplasmic reticulum in vitro. J. Mol. Biol. 124, 587-600.

24) Garoff, H., Huylebroeck, D., Robinson, A., Tillman, U., and Liljestrom, P. (1990). The signal sequence of the p62 protein of Semliki Forest virus is involved in initiation but not in completing chain translocation. J. Cell Biol. 111, 867-876.

25) Melancon, P., and Garoff, H. (1986). Reinitiation of translocation in the Semliki Forest virus structural polyprotein: Identification of the signal for the E1 glycoprotein. EMBO J. 5, 1551-1560.

26) Lobigs, M., Zhao, H., and Garoff, H. (1990b). Function of Semliki Forest virus E3 peptide in virus assembly: Replacement of E3 with an artificial signal peptide abolishes spike heterodimerization and surface expression of E1. J. Virol. 64, 4346-4355.

27) de Curtis, I., and Simons, K. (1988). Dissection of Semliki Forest virus glycoprotein delivery from the trans-Golgi network to the cell surface in permeabilized BHK cells. Proc. Natl. Acad. Sci. USA, 85, 8052-8056.

28) Helenius, A., Kielian, M., Mellman, I., and Schmid, S. (1989). Entry of enveloped viruses into their host cells. In Cell Biology of Virus Entry, Replication, and Pathogenesis, Vol. 90, Compans, R. W., Helenius, A., and Oldstone, M. B. A., eds. (New York: Alan R. Liss, Inc.), pp. 145-161.

29) Lobigs, M., and Garoff, H. (1990). Fusion function of the Semliki Forest virus spike is activated by proteolytic cleavage of the envelope glycoprotein p62. J. Virol. 64, 1233-1240.

30) Lobigs, M., Wahlberg, J. M., and Garoff, H. (1990a). Spike protein oligomerization control of Semliki Forest virus fusion. J. Virol. 64, 5214-5218.

31) Wahlberg, J. M., Boere, W. A., and Garoff, H. (1989). The heterodimeric association between the membrane proteins of Semliki Forest virus changes its sensitivity to mildly acidic pH during virus maturation. J. Virol. 63, 4991-4997.

32) Ziemiecki, A., Garoff, H., and Simons, K. (1980). Formation of the Semliki Forest virus membrane glycoprotein complexes in the infected cell. J. Gen. Virol. 50, 111-123.

33) Fuller, S. D. (1987). The T=4 envelope of Sindbis virus is organized by interactions with a complementary T=3 capsid. Cell 48, 923-934.

34) Wengler, G. (1980). Effects of alphaviruses on host cell macromolecular synthesis. In The Togaviruses, Vol. Schlesinger, R. W., eds. (New York: Academic Press, Inc.), pp. 459-472.

35) Stollar, V. (1980). Defective interfering alphaviruses. In The Togaviruses, Vol. Schlesinger, R. W., eds. (New York: Academic Press), pp. 427-457.

36) Boere, W. A. M., Harmsen, T., Vinje, J., Benaissa-Trouw, B. J., Kraaijeveld, C. A., and Snippe, H. (1984). Identification of distinct antigenic determinants on Semliki Forest virus by using monoclonal antibodies with different antiviral activities. J. Virol. 52, 575-582.

37) Greiser-Wilke, I., Moennig, V., Kaaden, O.-R., and Figueiredo, L. T. M. (1989). Most alphaviruses share a conserved epitopic region on their nucleocapsid protein. J. Gen. Virol. 70, 743-748.

38) Kondor, K. C., Bravo, R., Fuller, S. D., Cutler, D., and Garoff, H. (1985). Exocytotic pathways exist to both the apical and the basolateral cell surface of the polarized epithelial cell MDCK. Cell 43, 297-306.

39) Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989). Molecular Cloning. A Laboratory Manual. (Cold Spring Harbor: Cold Spring Harbor Laboratory Press).

40) Benson, S. A. (1984). A rapid procedure for isolation of DNA fragments from agarose gels. BioTechniques 2, 66-68.

41) Silhavy, T. J., Berman, M. L., and Enquist, L. W. (1984). Experiments with Gene Fusions. (New York: Cold Spring Harbor Laboratory Press).

42) Yanisch-Perron, C., Vieira, J., and Messing, J. (1985). Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene 33, 103-119.

43) Chung, C. T., and Miller, R. T. (1988). A rapid and convenient method for the preparation and storage of competent bacterial cells. Nucl. Acids Res. 16, 3580.

44) Kunkel, T. A., Roberts, J. D., and Zakour, R. A. (1987). Rapid and efficient site-specific mutagenesis without phenotypic selection. Meth. Enzymol. 154, 367-382.

45) Su, T.-Z., and El-Gewely, M. R. (1988). A multisite-directed mutagenesis using T7 DNA polymerase: application for reconstructing a mammalian gene. Gene 69, 81-89.

46) Krieg, P. A., and Melton, D. A. (1987). In vitro RNA synthesis with SP6 RNA polymerase. Meth. Enzymol. 155, 397-415.

47) Rice, C. M., Levis, R., Strauss, J. H., and Huang, H. V. (1987). Production of infectious RNA transcripts from Sindbis virus cDNA clones: Mapping of lethal mutations, rescue of a temperature-sensitive marker, and in vitro mutagenesis to generate defined mutants. J. Virol. 61, 3809-3819.

48) Cutler, D. F., and Garoff, H. (1986). Mutants of the membrane-binding region of Semliki Forest virus E2 protein. I. Cell surface transport and fusogenic activity. J. Cell Biol. 102, 889-901.

49) Chamberlain, J. P. (1979). Fluorographic detection of radioactivity in polyacrylamide gels with water-soluble fluor, sodium salicylate. Anal. Biochem. 98, 132-135.

50) Gubler, U., and Hoffman, B. J. (1983). A simple and very efficient method for generating cDNA libraries. Gene 25, 263-269.

51) Haymerle, H., Herz, J., Bressan, G. M., Frank, R., and Stanley, K. K. (1986). Efficient construction of cDNA libraries in plasmid expression vectors using an adaptor strategy. Nucl. Acids Res. 14, 8615-8624.

52) Davis, N. L., Willis, L. V., Smith, J. F., and Johnston, R. E. (1989). In vitro synthesis of infectious Venezuelan Equine Encephalitis virus RNA from a cDNA clone: Analysis of a viable deletion mutant. Virology 171, 189-204.

53) Niesters, H. G., and Strauss, J. H. (1990a). Defined mutations in the 5' nontranslated sequence of Sindbis virus RNA. J. Virol. 64, 4162-4168.

54) Niesters, H. G. M., and Strauss, J. H. (1990b). Mutagenesis of the conserved 51-nucleotide region of Sindbis virus. J. Virol. 64, 1639-1647.

55) Tsiang, M., Weiss, B. G., and Schlesinger, S. (1988). Effects of 5'-terminal modifications on the biological activity of defective interfering RNAs of Sindbis virus. J. Virol. 62, 47-53.

56) Kuhn, R. J., Hong, Z., and Strauss, J. H. (1990). Mutagenesis of the 3' nontranslated region of Sindbis virus RNA. J. Virol. 64, 1465-1476.

57) Levis, R., Weiss, B. G., Tsiang, M., Huang, H., and Schlesinger, S. (1986). Deletion mapping of Sindbis virus DI RNAs derived from cDNAs defines the sequences essential for replication and packaging. Cell 44, 137-145.

58) Kozak, M. (1989). The scanning model for translation: an update. J. Cell Biol. 108, 229-241.

59) Weiss, B., Nitschko, H., Ghattas, I., Wright, R., and Schlesinger, S. (1989). Evidence for specificity in the encapsidation of Sindbis virus RNAs. J. Virol. 63, 5310-5318.

60) Davis, N. L., Pence, D. F., Meyer, W. J., Schmaljohn, A. L., and Johnston, R. E. (1987). Alternative forms of a strain-specific neutralizing antigenic site on the Sindbis virus E2 glycoprotein. Virology 161, 101-108.

61) Mendoza, Q. P., Stanley, J., and Griffin, D. E. (1988). Monoclonal antibodies to the E1 and E2 glycoproteins of Sindbis virus: Definition of epitopes and efficiency of protection from fatal encephalitis. J. Gen. Virol. 70, 3015-3022.

62) Vrati, S., Fernon, C. A., Dalgarno, L., and Weir, R. C. (1988). Location of a major antigenic site involved in Ross River virus neutralization. Virology 162, 346-353.

63) Grosfeld, H., Velan, B., Leitner, M., Cohen, S., Lustig, S., Lachmi, B., and Shafferman, A. (1989). Semliki Forest virus E2 envelope epitopes induce a nonneutralizing humoral response which protects mice against lethal challenge. J. Virol. 63, 3416-3422.

64) Zerial, M., Melancon, P., Schneider, C., and Garoff, H. (1986). The transmembrane segment of the human transferrin receptor functions as a signal peptide. EMBO J. 5, 1543-1550.

65) Schneider, C., Owen, M. J., Banville, D., and Williams, J. G. (1984). Primary structure of human transferrin receptor deduced from the mRNA sequence. Nature 311, 675-678.

66) Ratner, L., Haseltine, W., Patarca, R., Livak, K. J., Starcich, B., Josephs, S. F., Doran, E. R., Rafalski, J. A., Whitehorn, E. A., Baumeister, K., Ivanoff, L., Petteway, S. R., Pearson, M. L., Lautenberger, J. A., Papas, T. S., Ghrayeb, J., Chang, N. T., Gallo, R. C., and Wong-Staal, F. (1985). Complete nucleotide sequence of the AIDS virus, HTLV-III. Nature 313, 277-284.

67) AIDS (1988). Sci. Am. 259. A single-topic issue on HIV biology.

68) Desrosiers, R. C., Wyand, M. S., Kodama, T., Ringler, D. J., Arthur, L. O., Sehgal, P. K., Letvin, N. L., King, N. W., and Daniel, M. D. (1989). Vaccine protection against simian immunodeficiency virus infection.

69) Ginsberg, H., Brown, F., Lerner, R. A., and Chanock, R. M. (1988). Vaccines 1988. New chemical and genetic approaches to vaccination, Cold Spring Harbor Laboratory, 396 pp.

70) Burke, K. L., Dunn, G., Ferguson, M., Minor, P. D., and Almond, J. W. (1988). Antigen chimeras of poliovirus as potential new vaccines. Nature 332, 81-82.

71) Colbere-Garapin, F., Christodoulou, C., Crainic, R., Garapin, A.-C., and Candrea, A. (1988). Addition of a foreign oligopeptide to the major capsid protein of poliovirus. Proc. Natl. Acad. Sci. USA 85, 8668-8672.

72) Evans, D. J., McKeating, J., Meredith, J. M., Burke, K. L., Katrak, K., John, A., Ferguson, M., Minor, P. D., Weiss, R. A., and Almond, J. W. (1989). An engineered poliovirus chimaera elicits broadly reactive HIV-1 neutralizing antibodies. Nature 339, 385-388.

    __________________________________________________________________________     SEQUENCE LISTING                                                               (1) GENERAL INFORMATION:                                                       (iii) NUMBER OF SEQUENCES: 27                                                  (2) INFORMATION FOR SEQ ID NO:1:                                               (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 11517 base pairs                                                   (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: RNA (genomic)                                              (iii) HYPOTHETICAL: NO                                                         (iv) ANTI-SENSE: NO                                                            (vi) ORIGINAL SOURCE:                                                          (A) ORGANISM: Semliki Forest Virus                                             (ix) FEATURE:                                                                  (A) NAME/KEY: -                                                                (B) LOCATION: 1..11517                                                         (D) OTHER INFORMATION: /label=genome                                           /note="Semliki Forest Virus complete nucleotide                                sequence, presented as a cloned DNA sequence; see                              Figure 5."                                                                     
(ix) FEATURE:                                                                  (A) NAME/KEY: CDS                                                              (B) LOCATION: 87..7379                                                         (D) OTHER INFORMATION: /product="SFV polyprotein"                              (ix) FEATURE:                                                                  (A) NAME/KEY: CDS                                                              (B) LOCATION: 7421..11179                                                      (D) OTHER INFORMATION: /product="SFV polyprotein"                              (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:                                        GATGGCGGATGTGTGACATACACGACGCCAAAAGATTTTGTTCCAGCTCCTGCCACCTCC60                 GCTACGCGAGAGATTAACCACCCACGATGGCCGCCAAAGTGCATGTTGATATT113                       MetAlaAlaLysValHisValAspIle                                                    15                                                                             GAGGCTGACAGCCCATTCATCAAGTCTTTGCAGAAGGCATTTCCGTCG161                            GluAlaAspSerProPheIleLysSerLeuGlnLysAlaPheProSer                               10152025                                                                       TTCGAGGTGGAGTCATTGCAGGTCACACCAAATGACCATGCAAATGCC209                            PheGluValGluSerLeuGlnValThrProAsnAspHisAlaAsnAla                               303540                                                                         AGAGCATTTTCGCACCTGGCTACCAAATTGATCGAGCAGGAGACTGAC257                            ArgAlaPheSerHisLeuAlaThrLysLeuIleGluGlnGluThrAsp                               455055                                                                         AAAGACACACTCATCTTGGATATCGGCAGTGCGCCTTCCAGGAGAATG305                            LysAspThrLeuIleLeuAspIleGlySerAlaProSerArgArgMet                               606570                                                                         
ATGTCTACGCACAAATACCACTGCGTATGCCCTATGCGCAGCGCAGAA353                            MetSerThrHisLysTyrHisCysValCysProMetArgSerAlaGlu                               758085                                                                         GACCCCGAAAGGCTCGATAGCTACGCAAAGAAACTGGCAGCGGCCTCC401                            AspProGluArgLeuAspSerTyrAlaLysLysLeuAlaAlaAlaSer                               9095100105                                                                     GGGAAGGTGCTGGATAGAGAGATCGCAGGAAAAATCACCGACCTGCAG449                            GlyLysValLeuAspArgGluIleAlaGlyLysIleThrAspLeuGln                               110115120                                                                      ACCGTCATGGCTACGCCAGACGCTGAATCTCCTACCTTTTGCCTGCAT497                            ThrValMetAlaThrProAspAlaGluSerProThrPheCysLeuHis                               125130135                                                                      ACAGACGTCACGTGTCGTACGGCAGCCGAAGTGGCCGTATACCAGGAC545                            ThrAspValThrCysArgThrAlaAlaGluValAlaValTyrGlnAsp                               140145150                                                                      GTGTATGCTGTACATGCACCAACATCGCTGTACCATCAGGCGATGAAA593                            ValTyrAlaValHisAlaProThrSerLeuTyrHisGlnAlaMetLys                               155160165                                                                      GGTGTCAGAACGGCGTATTGGATTGGGTTTGACACCACCCCGTTTATG641                            GlyValArgThrAlaTyrTrpIleGlyPheAspThrThrProPheMet                               170175180185                                                                   TTTGACGCGCTAGCAGGCGCGTATCCAACCTACGCCACAAACTGGGCC689                            PheAspAlaLeuAlaGlyAlaTyrProThrTyrAlaThrAsnTrpAla                               190195200                                                                      GACGAGCAGGTGTTACAGGCCAGGAACATAGGACTGTGTGCAGCATCC737                            
AspGluGlnValLeuGlnAlaArgAsnIleGlyLeuCysAlaAlaSer                               205210215                                                                      TTGACTGAGGGAAGACTCGGCAAACTGTCCATTCTCCGCAAGAAGCAA785                            LeuThrGluGlyArgLeuGlyLysLeuSerIleLeuArgLysLysGln                               220225230                                                                      TTGAAACCTTGCGACACAGTCATGTTCTCGGTAGGATCTACATTGTAC833                            LeuLysProCysAspThrValMetPheSerValGlySerThrLeuTyr                               235240245                                                                      ACTGAGAGCAGAAAGCTACTGAGGAGCTGGCACTTACCCTCCGTATTC881                            ThrGluSerArgLysLeuLeuArgSerTrpHisLeuProSerValPhe                               250255260265                                                                   CACCTGAAAGGTAAACAATCCTTTACCTGTAGGTGCGATACCATCGTA929                            HisLeuLysGlyLysGlnSerPheThrCysArgCysAspThrIleVal                               270275280                                                                      TCATGTGAAGGGTACGTAGTTAAGAAAATCACTATGTGCCCCGGCCTG977                            SerCysGluGlyTyrValValLysLysIleThrMetCysProGlyLeu                               285290295                                                                      TACGGTAAAACGGTAGGGTACGCCGTGACGTATCACGCGGAGGGATTC1025                           TyrGlyLysThrValGlyTyrAlaValThrTyrHisAlaGluGlyPhe                               300305310                                                                      CTAGTGTGCAAGACCACAGACACTGTCAAAGGAGAAAGAGTCTCATTC1073                           LeuValCysLysThrThrAspThrValLysGlyGluArgValSerPhe                               315320325                                                                      CCTGTATGCACCTACGTCCCCTCAACCATCTGTGATCAAATGACTGGC1121                           ProValCysThrTyrValProSerThrIleCysAspGlnMetThrGly                               330335340345             
                                                      ATACTAGCGACCGACGTCACACCGGAGGACGCACAGAAGTTGTTAGTG1169                           IleLeuAlaThrAspValThrProGluAspAlaGlnLysLeuLeuVal                               350355360                                                                      GGATTGAATCAGAGGATAGTTGTGAACGGAAGAACACAGCGAAACACT1217                           GlyLeuAsnGlnArgIleValValAsnGlyArgThrGlnArgAsnThr                               365370375                                                                      AACACGATGAAGAACTATCTGCTTCCGATTGTGGCCGTCGCATTTAGC1265                           AsnThrMetLysAsnTyrLeuLeuProIleValAlaValAlaPheSer                               380385390                                                                      AAGTGGGCGAGGGAATACAAGGCAGACCTTGATGATGAAAAACCTCTG1313                           LysTrpAlaArgGluTyrLysAlaAspLeuAspAspGluLysProLeu                               395400405                                                                      GGTGTCCGAGAGAGGTCACTTACTTGCTGCTGCTTGTGGGCATTTAAA1361                           GlyValArgGluArgSerLeuThrCysCysCysLeuTrpAlaPheLys                               410415420425                                                                   ACGAGGAAGATGCACACCATGTACAAGAAACCAGACACCCAGACAATA1409                           ThrArgLysMetHisThrMetTyrLysLysProAspThrGlnThrIle                               430435440                                                                      GTGAAGGTGCCTTCAGAGTTTAACTCGTTCGTCATCCCGAGCCTATGG1457                           ValLysValProSerGluPheAsnSerPheValIleProSerLeuTrp                               445450455                                                                      TCTACAGGCCTCGCAATCCCAGTCAGATCACGCATTAAGATGCTTTTG1505                           SerThrGlyLeuAlaIleProValArgSerArgIleLysMetLeuLeu                               460465470                                                                      
GCCAAGAAGACCAAGCGAGAGTTAATACCTGTTCTCGACGCGTCGTCA  1553
AlaLysLysThrLysArgGluLeuIleProValLeuAspAlaSerSer
   475            480            485
GCCAGGGATGCTGAACAAGAGGAGAAGGAGAGGTTGGAGGCCGAGCTG  1601
AlaArgAspAlaGluGlnGluGluLysGluArgLeuGluAlaGluLeu
490            495            500            505
ACTAGAGAAGCCTTACCACCCCTCGTCCCCATCGCGCCGGCGGAGACG  1649
ThrArgGluAlaLeuProProLeuValProIleAlaProAlaGluThr
            510            515            520
GGAGTCGTCGACGTCGACGTTGAAGAACTAGAGTATCACGCAGGTGCA  1697
GlyValValAspValAspValGluGluLeuGluTyrHisAlaGlyAla
         525            530            535
GGGGTCGTGGAAACACCTCGCAGCGCGTTGAAAGTCACCGCACAGCCG  1745
GlyValValGluThrProArgSerAlaLeuLysValThrAlaGlnPro
      540            545            550
AACGACGTACTACTAGGAAATTACGTAGTTCTGTCCCCGCAGACCGTG  1793
AsnAspValLeuLeuGlyAsnTyrValValLeuSerProGlnThrVal
   555            560            565
CTCAAGAGCTCCAAGTTGGCCCCCGTGCACCCTCTAGCAGAGCAGGTG  1841
LeuLysSerSerLysLeuAlaProValHisProLeuAlaGluGlnVal
570            575            580            585
AAAATAATAACACATAACGGGAGGGCCGGCGGTTACCAGGTCGACGGA  1889
LysIleIleThrHisAsnGlyArgAlaGlyGlyTyrGlnValAspGly
            590            595            600
TATGACGGCAGGGTCCTACTACCATGTGGATCGGCCATTCCGGTCCCT  1937
TyrAspGlyArgValLeuLeuProCysGlySerAlaIleProValPro
         605            610            615
GAGTTTCAAGCTTTGAGCGAGAGCGCCACTATGGTGTACAACGAAAGG  1985
GluPheGlnAlaLeuSerGluSerAlaThrMetValTyrAsnGluArg
      620            625            630
GAGTTCGTCAACAGGAAACTATACCATATTGCCGTTCACGGACCGTCG  2033
GluPheValAsnArgLysLeuTyrHisIleAlaValHisGlyProSer
   635            640            645
CTGAACACCGACGAGGAGAACTACGAGAAAGTCAGAGCTGAAAGAACT  2081
LeuAsnThrAspGluGluAsnTyrGluLysValArgAlaGluArgThr
650            655            660            665
GACGCCGAGTACGTGTTCGACGTAGATAAAAAATGCTGCGTCAAGAGA  2129
AspAlaGluTyrValPheAspValAspLysLysCysCysValLysArg
            670            675            680
GAGGAAGCGTCGGGTTTGGTGTTGGTGGGAGAGCTAACCAACCCCCCG  2177
GluGluAlaSerGlyLeuValLeuValGlyGluLeuThrAsnProPro
         685            690            695
TTCCATGAATTCGCCTACGAAGGGCTGAAGATCAGGCCGTCGGCACCA  2225
PheHisGluPheAlaTyrGluGlyLeuLysIleArgProSerAlaPro
      700            705            710
TATAAGACTACAGTAGTAGGAGTCTTTGGGGTTCCGGGATCAGGCAAG  2273
TyrLysThrThrValValGlyValPheGlyValProGlySerGlyLys
   715            720            725
TCTGCTATTATTAAGAGCCTCGTGACCAAACACGATCTGGTCACCAGC  2321
SerAlaIleIleLysSerLeuValThrLysHisAspLeuValThrSer
730            735            740            745
GGCAAGAAGGAGAACTGCCAGGAAATAGTTAACGACGTGAAGAAGCAC  2369
GlyLysLysGluAsnCysGlnGluIleValAsnAspValLysLysHis
            750            755            760
CGCGGGAAGGGGACAAGTAGGGAAAACAGTGACTCCATCCTGCTAAAC  2417
ArgGlyLysGlyThrSerArgGluAsnSerAspSerIleLeuLeuAsn
         765            770            775
GGGTGTCGTCGTGCCGTGGACATCCTATATGTGGACGAGGCTTTCGCT  2465
GlyCysArgArgAlaValAspIleLeuTyrValAspGluAlaPheAla
      780            785            790
TGCCATTCCGGTACTCTGCTGGCCCTAATTGCTCTTGTTAAACCTCGG  2513
CysHisSerGlyThrLeuLeuAlaLeuIleAlaLeuValLysProArg
   795            800            805
AGCAAAGTGGTGTTATGCGGAGACCCCAAGCAATGCGGATTCTTCAAT  2561
SerLysValValLeuCysGlyAspProLysGlnCysGlyPhePheAsn
810            815            820            825
ATGATGCAGCTTAAGGTGAACTTCAACCACAACATCTGCACTGAAGTA  2609
MetMetGlnLeuLysValAsnPheAsnHisAsnIleCysThrGluVal
            830            835            840
TGTCATAAAAGTATATCCAGACGTTGCACGCGTCCAGTCACGGCCATC  2657
CysHisLysSerIleSerArgArgCysThrArgProValThrAlaIle
         845            850            855
GTGTCTACGTTGCACTACGGAGGCAAGATGCGCACGACCAACCCGTGC  2705
ValSerThrLeuHisTyrGlyGlyLysMetArgThrThrAsnProCys
      860            865            870
AACAAACCCATAATCATAGACACCACAGGACAGACCAAGCCCAAGCCA  2753
AsnLysProIleIleIleAspThrThrGlyGlnThrLysProLysPro
   875            880            885
GGAGACATCGTGTTAACATGCTTCCGAGGCTGGGCAAAGCAGCTGCAG  2801
GlyAspIleValLeuThrCysPheArgGlyTrpAlaLysGlnLeuGln
890            895            900            905
TTGGACTACCGTGGACACGAAGTCATGACAGCAGCAGCATCTCAGGGC  2849
LeuAspTyrArgGlyHisGluValMetThrAlaAlaAlaSerGlnGly
            910            915            920
CTCACCCGCAAAGGGGTATACGCCGTAAGGCAGAAGGTGAATGAAAAT  2897
LeuThrArgLysGlyValTyrAlaValArgGlnLysValAsnGluAsn
         925            930            935
CCCTTGTATGCCCCTGCGTCGGAGCACGTGAATGTACTGCTGACGCGC  2945
ProLeuTyrAlaProAlaSerGluHisValAsnValLeuLeuThrArg
      940            945            950
ACTGAGGATAGGCTGGTGTGGAAAACGCTGGCCGGCGATCCCTGGATT  2993
ThrGluAspArgLeuValTrpLysThrLeuAlaGlyAspProTrpIle
   955            960            965
AAGGTCCTATCAAACATTCCACAGGGTAACTTTACGGCCACATTGGAA  3041
LysValLeuSerAsnIleProGlnGlyAsnPheThrAlaThrLeuGlu
970            975            980            985
GAATGGCAAGAAGAACACGACAAAATAATGAAGGTGATTGAAGGACCG  3089
GluTrpGlnGluGluHisAspLysIleMetLysValIleGluGlyPro
            990            995            1000
GCTGCGCCTGTGGACGCGTTCCAGAACAAAGCGAACGTGTGTTGGGCG  3137
AlaAlaProValAspAlaPheGlnAsnLysAlaAsnValCysTrpAla
         1005           1010           1015
AAAAGCCTGGTGCCTGTCCTGGACACTGCCGGAATCAGATTGACAGCA  3185
LysSerLeuValProValLeuAspThrAlaGlyIleArgLeuThrAla
      1020           1025           1030
GAGGAGTGGAGCACCATAATTACAGCATTTAAGGAGGACAGAGCTTAC  3233
GluGluTrpSerThrIleIleThrAlaPheLysGluAspArgAlaTyr
   1035           1040           1045
TCTCCAGTGGTGGCCTTGAATGAAATTTGCACCAAGTACTATGGAGTT  3281
SerProValValAlaLeuAsnGluIleCysThrLysTyrTyrGlyVal
1050           1055           1060           1065
GACCTGGACAGTGGCCTGTTTTCTGCCCCGAAGGTGTCCCTGTATTAC  3329
AspLeuAspSerGlyLeuPheSerAlaProLysValSerLeuTyrTyr
            1070           1075           1080
GAGAACAACCACTGGGATAACAGACCTGGTGGAAGGATGTATGGATTC  3377
GluAsnAsnHisTrpAspAsnArgProGlyGlyArgMetTyrGlyPhe
         1085           1090           1095
AATGCCGCAACAGCTGCCAGGCTGGAAGCTAGACATACCTTCCTGAAG  3425
AsnAlaAlaThrAlaAlaArgLeuGluAlaArgHisThrPheLeuLys
      1100           1105           1110
GGGCAGTGGCATACGGGCAAGCAGGCAGTTATCGCAGAAAGAAAAATC  3473
GlyGlnTrpHisThrGlyLysGlnAlaValIleAlaGluArgLysIle
   1115           1120           1125
CAACCGCTTTCTGTGCTGGACAATGTAATTCCTATCAACCGCAGGCTG  3521
GlnProLeuSerValLeuAspAsnValIleProIleAsnArgArgLeu
1130           1135           1140           1145
CCGCACGCCCTGGTGGCTGAGTACAAGACGGTTAAAGGCAGTAGGGTT  3569
ProHisAlaLeuValAlaGluTyrLysThrValLysGlySerArgVal
            1150           1155           1160
GAGTGGCTGGTCAATAAAGTAAGAGGGTACCACGTCCTGCTGGTGAGT  3617
GluTrpLeuValAsnLysValArgGlyTyrHisValLeuLeuValSer
         1165           1170           1175
GAGTACAACCTGGCTTTGCCTCGACGCAGGGTCACTTGGTTGTCACCG  3665
GluTyrAsnLeuAlaLeuProArgArgArgValThrTrpLeuSerPro
      1180           1185           1190
CTGAATGTCACAGGCGCCGATAGGTGCTACGACCTAAGTTTAGGACTG  3713
LeuAsnValThrGlyAlaAspArgCysTyrAspLeuSerLeuGlyLeu
   1195           1200           1205
CCGGCTGACGCCGGCAGGTTCGACTTGGTCTTTGTGAACATTCACACG  3761
ProAlaAspAlaGlyArgPheAspLeuValPheValAsnIleHisThr
1210           1215           1220           1225
GAATTCAGAATCCACCACTACCAGCAGTGTGTCGACCACGCCATGAAG  3809
GluPheArgIleHisHisTyrGlnGlnCysValAspHisAlaMetLys
            1230           1235           1240
CTGCAGATGCTTGGGGGAGATGCGCTACGACTGCTAAAACCCGGCGGC  3857
LeuGlnMetLeuGlyGlyAspAlaLeuArgLeuLeuLysProGlyGly
         1245           1250           1255
ATCTTGATGAGAGCTTACGGATACGCCGATAAAATCAGCGAAGCCGTT  3905
IleLeuMetArgAlaTyrGlyTyrAlaAspLysIleSerGluAlaVal
      1260           1265           1270
GTTTCCTCCTTAAGCAGAAAGTTCTCGTCTGCAAGAGTGTTGCGCCCG  3953
ValSerSerLeuSerArgLysPheSerSerAlaArgValLeuArgPro
   1275           1280           1285
GATTGTGTCACCAGCAATACAGAAGTGTTCTTGCTGTTCTCCAACTTT  4001
AspCysValThrSerAsnThrGluValPheLeuLeuPheSerAsnPhe
1290           1295           1300           1305
GACAACGGAAAGAGACCCTCTACGCTACACCAGATGAATACCAAGCTG  4049
AspAsnGlyLysArgProSerThrLeuHisGlnMetAsnThrLysLeu
            1310           1315           1320
AGTGCCGTGTATGCCGGAGAAGCCATGCACACGGCCGGGTGTGCACCA  4097
SerAlaValTyrAlaGlyGluAlaMetHisThrAlaGlyCysAlaPro
         1325           1330           1335
TCCTACAGAGTTAAGAGAGCAGACATAGCCACGTGCACAGAAGCGGCT  4145
SerTyrArgValLysArgAlaAspIleAlaThrCysThrGluAlaAla
      1340           1345           1350
GTGGTTAACGCAGCTAACGCCCGTGGAACTGTAGGGGATGGCGTATGC  4193
ValValAsnAlaAlaAsnAlaArgGlyThrValGlyAspGlyValCys
   1355           1360           1365
AGGGCCGTGGCGAAGAAATGGCCGTCAGCCTTTAAGGGAGCAGCAACA  4241
ArgAlaValAlaLysLysTrpProSerAlaPheLysGlyAlaAlaThr
1370           1375           1380           1385
CCAGTGGGCACAATTAAAACAGTCATGTGCGGCTCGTACCCCGTCATC  4289
ProValGlyThrIleLysThrValMetCysGlySerTyrProValIle
            1390           1395           1400
CACGCTGTAGCGCCTAATTTCTCTGCCACGACTGAAGCGGAAGGGGAC  4337
HisAlaValAlaProAsnPheSerAlaThrThrGluAlaGluGlyAsp
         1405           1410           1415
CGCGAATTGGCCGCTGTCTACCGGGCAGTGGCCGCCGAAGTAAACAGA  4385
ArgGluLeuAlaAlaValTyrArgAlaValAlaAlaGluValAsnArg
      1420           1425           1430
CTGTCACTGAGCAGCGTAGCCATCCCGCTGCTGTCCACAGGAGTGTTC  4433
LeuSerLeuSerSerValAlaIleProLeuLeuSerThrGlyValPhe
   1435           1440           1445
AGCGGCGGAAGAGATAGGCTGCAGCAATCCCTCAACCATCTATTCACA  4481
SerGlyGlyArgAspArgLeuGlnGlnSerLeuAsnHisLeuPheThr
1450           1455           1460           1465
GCAATGGACGCCACGGACGCTGACGTGACCATCTACTGCAGAGACAAA  4529
AlaMetAspAlaThrAspAlaAspValThrIleTyrCysArgAspLys
            1470           1475           1480
AGTTGGGAGAAGAAAATCCAGGAAGCCATTGACATGAGGACGGCTGTG  4577
SerTrpGluLysLysIleGlnGluAlaIleAspMetArgThrAlaVal
         1485           1490           1495
GAGTTGCTCAATGATGACGTGGAGCTGACCACAGACTTGGTGAGAGTG  4625
GluLeuLeuAsnAspAspValGluLeuThrThrAspLeuValArgVal
      1500           1505           1510
CACCCGGACAGCAGCCTGGTGGGTCGTAAGGGCTACAGTACCACTGAC  4673
HisProAspSerSerLeuValGlyArgLysGlyTyrSerThrThrAsp
   1515           1520           1525
GGGTCGCTGTACTCGTACTTTGAAGGTACGAAATTCAACCAGGCTGCT  4721
GlySerLeuTyrSerTyrPheGluGlyThrLysPheAsnGlnAlaAla
1530           1535           1540           1545
ATTGATATGGCAGAGATACTGACGTTGTGGCCCAGACTGCAAGAGGCA  4769
IleAspMetAlaGluIleLeuThrLeuTrpProArgLeuGlnGluAla
            1550           1555           1560
AACGAACAGATATGCCTATACGCGCTGGGCGAAACAATGGACAACATC  4817
AsnGluGlnIleCysLeuTyrAlaLeuGlyGluThrMetAspAsnIle
         1565           1570           1575
AGATCCAAATGTCCGGTGAACGATTCCGATTCATCAACACCTCCCAGG  4865
ArgSerLysCysProValAsnAspSerAspSerSerThrProProArg
      1580           1585           1590
ACAGTGCCCTGCCTGTGCCGCTACGCAATGACAGCAGAACGGATCGCC  4913
ThrValProCysLeuCysArgTyrAlaMetThrAlaGluArgIleAla
   1595           1600           1605
CGCCTTAGGTCACACCAAGTTAAAAGCATGGTGGTTTGCTCATCTTTT  4961
ArgLeuArgSerHisGlnValLysSerMetValValCysSerSerPhe
1610           1615           1620           1625
CCCCTCCCGAAATACCATGTAGATGGGGTGCAGAAGGTAAAGTGCGAG  5009
ProLeuProLysTyrHisValAspGlyValGlnLysValLysCysGlu
            1630           1635           1640
AAGGTTCTCCTGTTCGACCCGACGGTACCTTCAGTGGTTAGTCCGCGG  5057
LysValLeuLeuPheAspProThrValProSerValValSerProArg
         1645           1650           1655
AAGTATGCCGCATCTACGACGGACCACTCAGATCGGTCGTTACGAGGG  5105
LysTyrAlaAlaSerThrThrAspHisSerAspArgSerLeuArgGly
      1660           1665           1670
TTTGACTTGGACTGGACCACCGACTCGTCTTCCACTGCCAGCGATACC  5153
PheAspLeuAspTrpThrThrAspSerSerSerThrAlaSerAspThr
   1675           1680           1685
ATGTCGCTACCCAGTTTGCAGTCGTGTGACATCGACTCGATCTACGAG  5201
MetSerLeuProSerLeuGlnSerCysAspIleAspSerIleTyrGlu
1690           1695           1700           1705
CCAATGGCTCCCATAGTAGTGACGGCTGACGTACACCCTGAACCCGCA  5249
ProMetAlaProIleValValThrAlaAspValHisProGluProAla
            1710           1715           1720
GGCATCGCGGACCTGGCGGCAGATGTGCACCCTGAACCCGCAGACCAT  5297
GlyIleAlaAspLeuAlaAlaAspValHisProGluProAlaAspHis
         1725           1730           1735
GTGGACCTCGAGAACCCGATTCCTCCACCGCGCCCGAAGAGAGCTGCA  5345
ValAspLeuGluAsnProIleProProProArgProLysArgAlaAla
      1740           1745           1750
TACCTTGCCTCCCGCGCGGCGGAGCGACCGGTGCCGGCGCCGAGAAAG  5393
TyrLeuAlaSerArgAlaAlaGluArgProValProAlaProArgLys
   1755           1760           1765
CCGACGCCTGCCCCAAGGACTGCGTTTAGGAACAAGCTGCCTTTGACG  5441
ProThrProAlaProArgThrAlaPheArgAsnLysLeuProLeuThr
1770           1775           1780           1785
TTCGGCGACTTTGACGAGCACGAGGTCGATGCGTTGGCCTCCGGGATT  5489
PheGlyAspPheAspGluHisGluValAspAlaLeuAlaSerGlyIle
            1790           1795           1800
ACTTTCGGAGACTTCGACGACGTCCTGCGACTAGGCCGCGCGGGTGCA  5537
ThrPheGlyAspPheAspAspValLeuArgLeuGlyArgAlaGlyAla
         1805           1810           1815
TATATTTTCTCCTCGGACACTGGCAGCGGACATTTACAACAAAAATCC  5585
TyrIlePheSerSerAspThrGlySerGlyHisLeuGlnGlnLysSer
      1820           1825           1830
GTTAGGCAGCACAATCTCCAGTGCGCACAACTGGATGCGGTCCAGGAG  5633
ValArgGlnHisAsnLeuGlnCysAlaGlnLeuAspAlaValGlnGlu
   1835           1840           1845
GAGAAAATGTACCCGCCAAAATTGGATACTGAGAGGGAGAAGCTGTTG  5681
GluLysMetTyrProProLysLeuAspThrGluArgGluLysLeuLeu
1850           1855           1860           1865
CTGCTGAAAATGCAGATGCACCCATCGGAGGCTAATAAGAGTCGATAC  5729
LeuLeuLysMetGlnMetHisProSerGluAlaAsnLysSerArgTyr
            1870           1875           1880
CAGTCTCGCAAAGTGGAGAACATGAAAGCCACGGTGGTGGACAGGCTC  5777
GlnSerArgLysValGluAsnMetLysAlaThrValValAspArgLeu
         1885           1890           1895
ACATCGGGGGCCAGATTGTACACGGGAGCGGACGTAGGCCGCATACCA  5825
ThrSerGlyAlaArgLeuTyrThrGlyAlaAspValGlyArgIlePro
      1900           1905           1910
ACATACGCGGTTCGGTACCCCCGCCCCGTGTACTCCCCTACCGTGATC  5873
ThrTyrAlaValArgTyrProArgProValTyrSerProThrValIle
   1915           1920           1925
GAAAGATTCTCAAGCCCCGATGTAGCAATCGCAGCGTGCAACGAATAC  5921
GluArgPheSerSerProAspValAlaIleAlaAlaCysAsnGluTyr
1930           1935           1940           1945
CTATCCAGAAATTACCCAACAGTGGCGTCGTACCAGATAACAGATGAA  5969
LeuSerArgAsnTyrProThrValAlaSerTyrGlnIleThrAspGlu
            1950           1955           1960
TACGACGCATACTTGGACATGGTTGACGGGTCGGATAGTTGCTTGGAC  6017
TyrAspAlaTyrLeuAspMetValAspGlySerAspSerCysLeuAsp
         1965           1970           1975
AGAGCGACATTCTGCCCGGCGAAGCTCCGGTGCTACCCGAAACATCAT  6065
ArgAlaThrPheCysProAlaLysLeuArgCysTyrProLysHisHis
      1980           1985           1990
GCGTACCACCAGCCGACTGTACGCAGTGCCGTCCCGTCACCCTTTCAG  6113
AlaTyrHisGlnProThrValArgSerAlaValProSerProPheGln
   1995           2000           2005
AACACACTACAGAACGTGCTAGCGGCCGCCACCAAGAGAAACTGCAAC  6161
AsnThrLeuGlnAsnValLeuAlaAlaAlaThrLysArgAsnCysAsn
2010           2015           2020           2025
GTCACGCAAATGCGAGAACTACCCACCATGGACTCGGCAGTGTTCAAC  6209
ValThrGlnMetArgGluLeuProThrMetAspSerAlaValPheAsn
            2030           2035           2040
GTGGAGTGCTTCAAGCGCTATGCCTGCTCCGGAGAATATTGGGAAGAA  6257
ValGluCysPheLysArgTyrAlaCysSerGlyGluTyrTrpGluGlu
         2045           2050           2055
TATGCTAAACAACCTATCCGGATAACCACTGAGAACATCACTACCTAT  6305
TyrAlaLysGlnProIleArgIleThrThrGluAsnIleThrThrTyr
      2060           2065           2070
GTGACCAAATTGAAAGGCCCGAAAGCTGCTGCCTTGTTCGCTAAGACC  6353
ValThrLysLeuLysGlyProLysAlaAlaAlaLeuPheAlaLysThr
   2075           2080           2085
CACAACTTGGTTCCGCTGCAGGAGGTTCCCATGGACAGATTCACGGTC  6401
HisAsnLeuValProLeuGlnGluValProMetAspArgPheThrVal
2090           2095           2100           2105
GACATGAAACGAGATGTCAAAGTCACTCCAGGGACGAAACACACAGAG  6449
AspMetLysArgAspValLysValThrProGlyThrLysHisThrGlu
            2110           2115           2120
GAAAGACCCAAAGTCCAGGTAATTCAAGCAGCGGAGCCATTGGCGACC  6497
GluArgProLysValGlnValIleGlnAlaAlaGluProLeuAlaThr
         2125           2130           2135
GCTTACCTGTGCGGCATCCACAGGGAATTAGTAAGGAGACTAAATGCT  6545
AlaTyrLeuCysGlyIleHisArgGluLeuValArgArgLeuAsnAla
      2140           2145           2150
GTGTTACGCCCTAACGTGCACACATTGTTTGATATGTCGGCCGAAGAC  6593
ValLeuArgProAsnValHisThrLeuPheAspMetSerAlaGluAsp
   2155           2160           2165
TTTGACGCGATCATCGCCTCTCACTTCCACCCAGGAGACCCGGTTCTA  6641
PheAspAlaIleIleAlaSerHisPheHisProGlyAspProValLeu
2170           2175           2180           2185
GAGACGGACATTGCATCATTCGACAAAAGCCAGGACGACTCCTTGGCT  6689
GluThrAspIleAlaSerPheAspLysSerGlnAspAspSerLeuAla
            2190           2195           2200
CTTACAGGTTTAATGATCCTCGAAGATCTAGGGGTGGATCAGTACCTG  6737
LeuThrGlyLeuMetIleLeuGluAspLeuGlyValAspGlnTyrLeu
         2205           2210           2215
CTGGACTTGATCGAGGCAGCCTTTGGGGAAATATCCAGCTGTCACCTA  6785
LeuAspLeuIleGluAlaAlaPheGlyGluIleSerSerCysHisLeu
      2220           2225           2230
CCAACTGGCACGCGCTTCAAGTTCGGAGCTATGATGAAATCGGGCATG  6833
ProThrGlyThrArgPheLysPheGlyAlaMetMetLysSerGlyMet
   2235           2240           2245
TTTCTGACTTTGTTTATTAACACTGTTTTGAACATCACCATAGCAAGC  6881
PheLeuThrLeuPheIleAsnThrValLeuAsnIleThrIleAlaSer
2250           2255           2260           2265
AGGGTACTGGAGCAGAGACTCACTGACTCCGCCTGTGCGGCCTTCATC  6929
ArgValLeuGluGlnArgLeuThrAspSerAlaCysAlaAlaPheIle
            2270           2275           2280
GGCGACGACAACATCGTTCACGGAGTGATCTCCGACAAGCTGATGGCG  6977
GlyAspAspAsnIleValHisGlyValIleSerAspLysLeuMetAla
         2285           2290           2295
GAGAGGTGCGCGTCGTGGGTCAACATGGAGGTGAAGATCATTGACGCT  7025
GluArgCysAlaSerTrpValAsnMetGluValLysIleIleAspAla
      2300           2305           2310
GTCATGGGCGAAAAACCCCCATATTTTTGTGGGGGATTCATAGTTTTT  7073
ValMetGlyGluLysProProTyrPheCysGlyGlyPheIleValPhe
   2315           2320           2325
GACAGCGTCACACAGACCGCCTGCCGTGTTTCAGACCCACTTAAGCGC  7121
AspSerValThrGlnThrAlaCysArgValSerAspProLeuLysArg
2330           2335           2340           2345
CTGTTCAAGTTGGGTAAGCCGCTAACAGCTGAAGACAAGCAGGACGAA  7169
LeuPheLysLeuGlyLysProLeuThrAlaGluAspLysGlnAspGlu
            2350           2355           2360
GACAGGCGACGAGCACTGAGTGACGAGGTTAGCAAGTGGTTCCGGACA  7217
AspArgArgArgAlaLeuSerAspGluValSerLysTrpPheArgThr
         2365           2370           2375
GGCTTGGGGGCCGAACTGGAGGTGGCACTAACATCTAGGTATGAGGTA  7265
GlyLeuGlyAlaGluLeuGluValAlaLeuThrSerArgTyrGluVal
      2380           2385           2390
GAGGGCTGCAAAAGTATCCTCATAGCCATGACCACCTTGGCGAGGGAC  7313
GluGlyCysLysSerIleLeuIleAlaMetThrThrLeuAlaArgAsp
   2395           2400           2405
ATTAAGGCGTTTAAGAAATTGAGAGGACCTGTTATACACCTCTACGGC  7361
IleLysAlaPheLysLysLeuArgGlyProValIleHisLeuTyrGly
2410           2415           2420           2425
GGTCCTAGATTGGTGCGTTAATACACAGAATTCTGATTATAGCGCACT  7409
GlyProArgLeuValArg
            2430
ATTATAGCACCATGAATTACATCCCTACGCAAACGTTTTACGGCCGCCGG  7459
           MetAsnTyrIleProThrGlnThrPheTyrGlyArgArg
           1           5              10
TGGCGCCCGCGCCCGGCGGCCCGTCCTTGGCCGTTGCAGGCCACTCCG  7507
TrpArgProArgProAlaAlaArgProTrpProLeuGlnAlaThrPro
   15             20             25
GTGGCTCCCGTCGTCCCCGACTTCCAGGCCCAGCAGATGCAGCAACTC  7555
ValAlaProValValProAspPheGlnAlaGlnGlnMetGlnGlnLeu
30             35             40             45
ATCAGCGCCGTAAATGCGCTGACAATGAGACAGAACGCAATTGCTCCT  7603
IleSerAlaValAsnAlaLeuThrMetArgGlnAsnAlaIleAlaPro
            50             55             60
GCTAGGCCTCCCAAACCAAAGAAGAAGAAGACAACCAAACCAAAGCCG  7651
AlaArgProProLysProLysLysLysLysThrThrLysProLysPro
         65             70             75
AAAACGCAGCCCAAGAAGATCAACGGAAAAACGCAGCAGCAAAAGAAG  7699
LysThrGlnProLysLysIleAsnGlyLysThrGlnGlnGlnLysLys
      80             85             90
AAAGACAAGCAAGCCGACAAGAAGAAGAAGAAACCCGGAAAAAGAGAA  7747
LysAspLysGlnAlaAspLysLysLysLysLysProGlyLysArgGlu
   95             100            105
AGAATGTGCATGAAGATTGAAAATGACTGTATCTTCGAAGTCAAACAC  7795
ArgMetCysMetLysIleGluAsnAspCysIlePheGluValLysHis
110            115            120            125
GAAGGAAAGGTCACTGGGTACGCCTGCCTGGTGGGCGACAAAGTCATG  7843
GluGlyLysValThrGlyTyrAlaCysLeuValGlyAspLysValMet
            130            135            140
AAACCTGCCCACGTGAAAGGAGTCATCGACAACGCGGACCTGGCAAAG  7891
LysProAlaHisValLysGlyValIleAspAsnAlaAspLeuAlaLys
         145            150            155
CTAGCTTTCAAGAAATCGAGCAAGTATGACCTTGAGTGTGCCCAGATA  7939
LeuAlaPheLysLysSerSerLysTyrAspLeuGluCysAlaGlnIle
      160            165            170
CCAGTTCACATGAGGTCGGATGCCTCAAAGTACACGCATGAGAAGCCC  7987
ProValHisMetArgSerAspAlaSerLysTyrThrHisGluLysPro
   175            180            185
GAGGGACACTATAACTGGCACCACGGGGCTGTTCAGTACAGCGGAGGT  8035
GluGlyHisTyrAsnTrpHisHisGlyAlaValGlnTyrSerGlyGly
190            195            200            205
AGGTTCACTATACCGACAGGAGCGGGCAAACCGGGAGACAGTGGCCGG  8083
ArgPheThrIleProThrGlyAlaGlyLysProGlyAspSerGlyArg
            210            215            220
CCCATCTTTGACAACAAGGGGAGGGTAGTCGCTATCGTCCTGGGCGGG  8131
ProIlePheAspAsnLysGlyArgValValAlaIleValLeuGlyGly
         225            230            235
GCCAACGAGGGCTCACGCACAGCACTGTCGGTGGTCACCTGGAACAAA  8179
AlaAsnGluGlySerArgThrAlaLeuSerValValThrTrpAsnLys
      240            245            250
GATATGGTGACTAGAGTGACCCCCGAGGGGTCCGAAGAGTGGTCCGCC  8227
AspMetValThrArgValThrProGluGlySerGluGluTrpSerAla
   255            260            265
CCGCTGATTACTGCCATGTGTGTCCTTGCCAATGCTACCTTCCCGTGC  8275
ProLeuIleThrAlaMetCysValLeuAlaAsnAlaThrPheProCys
270            275            280            285
TTCCAGCCCCCGTGTGTACCTTGCTGCTATGAAAACAACGCAGAGGCC  8323
PheGlnProProCysValProCysCysTyrGluAsnAsnAlaGluAla
            290            295            300
                                                      ACACTACGGATGCTCGAGGATAACGTGGATAGGCCAGGGTACTACGAC8371                           ThrLeuArgMetLeuGluAspAsnValAspArgProGlyTyrTyrAsp                               305310315                                                                      CTCCTTCAGGCAGCCTTGACGTGCCGAAACGGAACAAGACACCGGCGC8419                           LeuLeuGlnAlaAlaLeuThrCysArgAsnGlyThrArgHisArgArg                               320325330                                                                      AGCGTGTCGCAACACTTCAACGTGTATAAGGCTACACGCCCTTACATC8467                           SerValSerGlnHisPheAsnValTyrLysAlaThrArgProTyrIle                               335340345                                                                      GCGTACTGCGCCGACTGCGGAGCAGGGCACTCGTGTCATAGCCCCGTA8515                           AlaTyrCysAlaAspCysGlyAlaGlyHisSerCysHisSerProVal                               350355360365                                                                   GCAATTGAAGCGGTCAGGTCCGAAGCTACCGACGGGATGCTGAAGATT8563                           AlaIleGluAlaValArgSerGluAlaThrAspGlyMetLeuLysIle                               370375380                                                                      CAGTTCTCGGCACAAATTGGCATAGATAAGAGTGACAATCATGACTAC8611                           GlnPheSerAlaGlnIleGlyIleAspLysSerAspAsnHisAspTyr                               385390395                                                                      ACGAAGATAAGGTACGCAGACGGGCACGCCATTGAGAATGCCGTCCGG8659                           ThrLysIleArgTyrAlaAspGlyHisAlaIleGluAsnAlaValArg                               400405410                                                                      TCATCTTTGAAGGTAGCCACCTCCGGAGACTGTTTCGTCCATGGCACA8707                           SerSerLeuLysValAlaThrSerGlyAspCysPheValHisGlyThr                               415420425                                                                      
ATGGGACATTTCATACTGGCAAAGTGCCCACCGGGTGAATTCCTGCAG8755                           MetGlyHisPheIleLeuAlaLysCysProProGlyGluPheLeuGln                               430435440445                                                                   GTCTCGATCCAGGACACCAGAAACGCGGTCCGTGCCTGCAGAATACAA8803                           ValSerIleGlnAspThrArgAsnAlaValArgAlaCysArgIleGln                               450455460                                                                      TATCATCATGACCCTCAACCGGTGGGTAGAGAAAAATTTACAATTAGA8851                           TyrHisHisAspProGlnProValGlyArgGluLysPheThrIleArg                               465470475                                                                      CCACACTATGGAAAAGAGATCCCTTGCACCACTTATCAACAGACCACA8899                           ProHisTyrGlyLysGluIleProCysThrThrTyrGlnGlnThrThr                               480485490                                                                      GCGAAGACCGTGGAGGAAATCGACATGCATATGCCGCCAGATACGCCG8947                           AlaLysThrValGluGluIleAspMetHisMetProProAspThrPro                               495500505                                                                      GACAGGACGTTGCTATCACAGCAATCTGGCAATGTAAAGATCACAGTC8995                           AspArgThrLeuLeuSerGlnGlnSerGlyAsnValLysIleThrVal                               510515520525                                                                   GGAGGAAAGAAGGTGAAATACAACTGCACCTGTGGAACCGGAAACGTT9043                           GlyGlyLysLysValLysTyrAsnCysThrCysGlyThrGlyAsnVal                               530535540                                                                      GGCACTACTAATTCGGACATGACGATCAACACGTGTCTAATAGAGCAG9091                           GlyThrThrAsnSerAspMetThrIleAsnThrCysLeuIleGluGln                               545550555                                                                      TGCCACGTCTCAGTGACGGACCATAAGAAATGGCAGTTCAACTCACCT9139                           
CysHisValSerValThrAspHisLysLysTrpGlnPheAsnSerPro                               560565570                                                                      TTCGTCCCGAGAGCCGACGAACCGGCTAGAAAAGGCAAAGTCCATATC9187                           PheValProArgAlaAspGluProAlaArgLysGlyLysValHisIle                               575580585                                                                      CCATTCCCGTTGGACAACATCACATGCAGAGTTCCAATGGCGCGCGAA9235                           ProPheProLeuAspAsnIleThrCysArgValProMetAlaArgGlu                               590595600605                                                                   CCAACCGTCATCCACGGCAAAAGAGAAGTGACACTGCACCTTCACCCA9283                           ProThrValIleHisGlyLysArgGluValThrLeuHisLeuHisPro                               610615620                                                                      GATCATCCCACGCTCTTTTCCTACCGCACACTGGGTGAGGACCCGCAG9331                           AspHisProThrLeuPheSerTyrArgThrLeuGlyGluAspProGln                               625630635                                                                      TATCACGAGGAATGGGTGACAGCGGCGGTGGAACGGACCATACCCGTA9379                           TyrHisGluGluTrpValThrAlaAlaValGluArgThrIleProVal                               640645650                                                                      CCAGTGGACGGGATGGAGTACCACTGGGGAAACAACGACCCAGTGAGG9427                           ProValAspGlyMetGluTyrHisTrpGlyAsnAsnAspProValArg                               655660665                                                                      CTTTGGTCTCAACTCACCACTGAAGGGAAACCGCACGGCTGGCCGCAT9475                           LeuTrpSerGlnLeuThrThrGluGlyLysProHisGlyTrpProHis                               670675680685                                                                   CAGATCGTACAGTACTACTATGGGCTTTACCCGGCCGCTACAGTATCC9523                           GlnIleValGlnTyrTyrTyrGlyLeuTyrProAlaAlaThrValSer                               690695700                
                                                      GCGGTCGTCGGGATGAGCTTACTGGCGTTGATATCGATCTTCGCGTCG9571                           AlaValValGlyMetSerLeuLeuAlaLeuIleSerIlePheAlaSer                               705710715                                                                      TGCTACATGCTGGTTGCGGCCCGCAGTAAGTGCTTGACCCCTTATGCT9619                           CysTyrMetLeuValAlaAlaArgSerLysCysLeuThrProTyrAla                               720725730                                                                      TTAACACCAGGAGCTGCAGTTCCGTGGACGCTGGGGATACTCTGCTGC9667                           LeuThrProGlyAlaAlaValProTrpThrLeuGlyIleLeuCysCys                               735740745                                                                      GCCCCGCGGGCGCACGCAGCTAGTGTGGCAGAGACTATGGCCTACTTG9715                           AlaProArgAlaHisAlaAlaSerValAlaGluThrMetAlaTyrLeu                               750755760765                                                                   TGGGACCAAAACCAAGCGTTGTTCTGGTTGGAGTTTGCGGCCCCTGTT9763                           TrpAspGlnAsnGlnAlaLeuPheTrpLeuGluPheAlaAlaProVal                               770775780                                                                      GCCTGCATCCTCATCATCACGTATTGCCTCAGAAACGTGCTGTGTTGC9811                           AlaCysIleLeuIleIleThrTyrCysLeuArgAsnValLeuCysCys                               785790795                                                                      TGTAAGAGCCTTTCTTTTTTAGTGCTACTGAGCCTCGGGGCAACCGCC9859                           CysLysSerLeuSerPheLeuValLeuLeuSerLeuGlyAlaThrAla                               800805810                                                                      AGAGCTTACGAACATTCGACAGTAATGCCGAACGTGGTGGGGTTCCCG9907                           ArgAlaTyrGluHisSerThrValMetProAsnValValGlyPhePro                               815820825                                                                      
TATAAGGCTCACATTGAAAGGCCAGGATATAGCCCCCTCACTTTGCAG9955                           TyrLysAlaHisIleGluArgProGlyTyrSerProLeuThrLeuGln                               830835840845                                                                   ATGCAGGTTGTTGAAACCAGCCTCGAACCAACCCTTAATTTGGAATAC10003                          MetGlnValValGluThrSerLeuGluProThrLeuAsnLeuGluTyr                               850855860                                                                      ATAACCTGTGAGTACAAGACGGTCGTCCCGTCGCCGTACGTGAAGTGC10051                          IleThrCysGluTyrLysThrValValProSerProTyrValLysCys                               865870875                                                                      TGCGGCGCCTCAGAGTGCTCCACTAAAGAGAAGCCTGACTACCAATGC10099                          CysGlyAlaSerGluCysSerThrLysGluLysProAspTyrGlnCys                               880885890                                                                      AAGGTTTACACAGGCGTGTACCCGTTCATGTGGGGAGGGGCATATTGC10147                          LysValTyrThrGlyValTyrProPheMetTrpGlyGlyAlaTyrCys                               895900905                                                                      TTCTGCGACTCAGAAAACACGCAACTCAGCGAGGCGTACGTCGATCGA10195                          PheCysAspSerGluAsnThrGlnLeuSerGluAlaTyrValAspArg                               910915920925                                                                   TCGGACGTATGCAGGCATGATCACGCATCTGCTTACAAAGCCCATACA10243                          SerAspValCysArgHisAspHisAlaSerAlaTyrLysAlaHisThr                               930935940                                                                      GCATCGCTGAAGGCCAAAGTGAGGGTTATGTACGGCAACGTAAACCAG10291                          AlaSerLeuLysAlaLysValArgValMetTyrGlyAsnValAsnGln                               945950955                                                                      ACTGTGGATGTTTACGTGAACGGAGACCATGCCGTCACGATAGGGGGT10339                          
ThrValAspValTyrValAsnGlyAspHisAlaValThrIleGlyGly                               960965970                                                                      ACTCAGTTCATATTCGGGCCGCTGTCATCGGCCTGGACCCCGTTCGAC10387                          ThrGlnPheIlePheGlyProLeuSerSerAlaTrpThrProPheAsp                               975980985                                                                      AACAAGATAGTCGTGTACAAAGACGAAGTGTTCAATCAGGACTTCCCG10435                          AsnLysIleValValTyrLysAspGluValPheAsnGlnAspPhePro                               99099510001005                                                                 CCGTACGGATCTGGGCAACCAGGGCGCTTCGGCGACATCCAAAGCAGA10483                          ProTyrGlySerGlyGlnProGlyArgPheGlyAspIleGlnSerArg                               101010151020                                                                   ACAGTGGAGAGTAACGACCTGTACGCGAACACGGCACTGAAGCTGGCA10531                          ThrValGluSerAsnAspLeuTyrAlaAsnThrAlaLeuLysLeuAla                               102510301035                                                                   CGCCCTTCACCCGGCATGGTCCATGTACCGTACACACAGACACCTTCA10579                          ArgProSerProGlyMetValHisValProTyrThrGlnThrProSer                               104010451050                                                                   GGGTTCAAATATTGGCTAAAGGAAAAAGGGACAGCCCTAAATACGAAG10627                          GlyPheLysTyrTrpLeuLysGluLysGlyThrAlaLeuAsnThrLys                               105510601065                                                                   GCTCCTTTTGGCTGCCAAATCAAAACGAACCCTGTCAGGGCCATGAAC10675                          AlaProPheGlyCysGlnIleLysThrAsnProValArgAlaMetAsn                               1070107510801085                                                               TGCGCCGTGGGAAACATCCCTGTCTCCATGAATTTGCCTGACAGCGCC10723                          CysAlaValGlyAsnIleProValSerMetAsnLeuProAspSerAla                               109010951100             
                                                      TTTACCCGCATTGTCGAGGCGCCGACCATCATTGACCTGACTTGCACA10771                          PheThrArgIleValGluAlaProThrIleIleAspLeuThrCysThr                               110511101115                                                                   GTGGCTACCTGTACGCACTCCTCGGATTTCGGCGGCGTCTTGACACTG10819                          ValAlaThrCysThrHisSerSerAspPheGlyGlyValLeuThrLeu                               112011251130                                                                   ACGTACAAGACCAACAAGAACGGGGACTGCTCTGTACACTCGCACTCT10867                          ThrTyrLysThrAsnLysAsnGlyAspCysSerValHisSerHisSer                               113511401145                                                                   AACGTAGCTACTCTACAGGAGGCCACAGCAAAAGTGAAGACAGCAGGT10915                          AsnValAlaThrLeuGlnGluAlaThrAlaLysValLysThrAlaGly                               1150115511601165                                                               AAGGTGACCTTACACTTCTCCACGGCAAGCGCATCACCTTCTTTTGTG10963                          LysValThrLeuHisPheSerThrAlaSerAlaSerProSerPheVal                               117011751180                                                                   GTGTCGCTATGCAGTGCTAGGGCCACCTGTTCAGCGTCGTGTGAGCCC11011                          ValSerLeuCysSerAlaArgAlaThrCysSerAlaSerCysGluPro                               118511901195                                                                   CCGAAAGACCACATAGTCCCATATGCGGCTAGCCACAGTAACGTAGTG11059                          ProLysAspHisIleValProTyrAlaAlaSerHisSerAsnValVal                               120012051210                                                                   TTTCCAGACATGTCGGGCACCGCACTATCATGGGTGCAGAAAATCTCG11107                          PheProAspMetSerGlyThrAlaLeuSerTrpValGlnLysIleSer                               121512201225                                                                   
GGTGGTCTGGGGGCCTTCGCAATCGGCGCTATCCTGGTGCTGGTTGTG 11155
GlyGlyLeuGlyAlaPheAlaIleGlyAlaIleLeuValLeuValVal
1230 1235 1240 1245
GTCACTTGCATTGGGCTCCGCAGATAAGTTAGGGTAGGCAATGGCATTGATATA 11209
ValThrCysIleGlyLeuArgArg
1250
GCAAGAAAATTGAAAACAGAAAAAGTTAGGGTAAGCAATGGCATATAACCATAACTGTAT 11269
AACTTGTAACAAAGCGCAACAAGACCTGCGCAATTGGCCCCGTGGTCCGCCTCACGGAAA 11329
CTCGGGGCAACTCATATTGACACATTAATTGGCAATAATTGGAAGCTTACATAAGCTTAA 11389
TTCGACGAATAATTGGATTTTTATTTTATTTTGCAATTGGTTTTTAATATTTCCAAAAAA 11449
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 11509
AAAACTAG 11517

(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2431 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
MetAlaAlaLysValHisValAspIleGluAlaAspSerProPheIle
1 5 10 15
LysSerLeuGlnLysAlaPheProSerPheGluValGluSerLeuGln
20 25 30
ValThrProAsnAspHisAlaAsnAlaArgAlaPheSerHisLeuAla
35 40 45
ThrLysLeuIleGluGlnGluThrAspLysAspThrLeuIleLeuAsp                               505560                                                                         IleGlySerAlaProSerArgArgMetMetSerThrHisLysTyrHis                               65707580                                                                       CysValCysProMetArgSerAlaGluAspProGluArgLeuAspSer                               859095                                                                         TyrAlaLysLysLeuAlaAlaAlaSerGlyLysValLeuAspArgGlu                               100105110                                                                      IleAlaGlyLysIleThrAspLeuGlnThrValMetAlaThrProAsp                               115120125                                                                      AlaGluSerProThrPheCysLeuHisThrAspValThrCysArgThr                               130135140                                                                      AlaAlaGluValAlaValTyrGlnAspValTyrAlaValHisAlaPro                               145150155160                                                                   ThrSerLeuTyrHisGlnAlaMetLysGlyValArgThrAlaTyrTrp                               165170175                                                                      IleGlyPheAspThrThrProPheMetPheAspAlaLeuAlaGlyAla                               180185190                                                                      TyrProThrTyrAlaThrAsnTrpAlaAspGluGlnValLeuGlnAla                               195200205                                                                      ArgAsnIleGlyLeuCysAlaAlaSerLeuThrGluGlyArgLeuGly                               210215220                                                                      LysLeuSerIleLeuArgLysLysGlnLeuLysProCysAspThrVal                               225230235240                                                                   MetPheSerValGlySerThrLeuTyrThrGluSerArgLysLeuLeu                               245250255                
                                                      ArgSerTrpHisLeuProSerValPheHisLeuLysGlyLysGlnSer                               260265270                                                                      PheThrCysArgCysAspThrIleValSerCysGluGlyTyrValVal                               275280285                                                                      LysLysIleThrMetCysProGlyLeuTyrGlyLysThrValGlyTyr                               290295300                                                                      AlaValThrTyrHisAlaGluGlyPheLeuValCysLysThrThrAsp                               305310315320                                                                   ThrValLysGlyGluArgValSerPheProValCysThrTyrValPro                               325330335                                                                      SerThrIleCysAspGlnMetThrGlyIleLeuAlaThrAspValThr                               340345350                                                                      ProGluAspAlaGlnLysLeuLeuValGlyLeuAsnGlnArgIleVal                               355360365                                                                      ValAsnGlyArgThrGlnArgAsnThrAsnThrMetLysAsnTyrLeu                               370375380                                                                      LeuProIleValAlaValAlaPheSerLysTrpAlaArgGluTyrLys                               385390395400                                                                   AlaAspLeuAspAspGluLysProLeuGlyValArgGluArgSerLeu                               405410415                                                                      ThrCysCysCysLeuTrpAlaPheLysThrArgLysMetHisThrMet                               420425430                                                                      TyrLysLysProAspThrGlnThrIleValLysValProSerGluPhe                               435440445                                                                      AsnSerPheValIleProSerLeuTrpSerThrGlyLeuAlaIlePro  
                             450455460                                                                      ValArgSerArgIleLysMetLeuLeuAlaLysLysThrLysArgGlu                               465470475480                                                                   LeuIleProValLeuAspAlaSerSerAlaArgAspAlaGluGlnGlu                               485490495                                                                      GluLysGluArgLeuGluAlaGluLeuThrArgGluAlaLeuProPro                               500505510                                                                      LeuValProIleAlaProAlaGluThrGlyValValAspValAspVal                               515520525                                                                      GluGluLeuGluTyrHisAlaGlyAlaGlyValValGluThrProArg                               530535540                                                                      SerAlaLeuLysValThrAlaGlnProAsnAspValLeuLeuGlyAsn                               545550555560                                                                   TyrValValLeuSerProGlnThrValLeuLysSerSerLysLeuAla                               565570575                                                                      ProValHisProLeuAlaGluGlnValLysIleIleThrHisAsnGly                               580585590                                                                      ArgAlaGlyGlyTyrGlnValAspGlyTyrAspGlyArgValLeuLeu                               595600605                                                                      ProCysGlySerAlaIleProValProGluPheGlnAlaLeuSerGlu                               610615620                                                                      SerAlaThrMetValTyrAsnGluArgGluPheValAsnArgLysLeu                               625630635640                                                                   TyrHisIleAlaValHisGlyProSerLeuAsnThrAspGluGluAsn                               645650655                                                                  
    TyrGluLysValArgAlaGluArgThrAspAlaGluTyrValPheAsp                               660665670                                                                      ValAspLysLysCysCysValLysArgGluGluAlaSerGlyLeuVal                               675680685                                                                      LeuValGlyGluLeuThrAsnProProPheHisGluPheAlaTyrGlu                               690695700                                                                      GlyLeuLysIleArgProSerAlaProTyrLysThrThrValValGly                               705710715720                                                                   ValPheGlyValProGlySerGlyLysSerAlaIleIleLysSerLeu                               725730735                                                                      ValThrLysHisAspLeuValThrSerGlyLysLysGluAsnCysGln                               740745750                                                                      GluIleValAsnAspValLysLysHisArgGlyLysGlyThrSerArg                               755760765                                                                      GluAsnSerAspSerIleLeuLeuAsnGlyCysArgArgAlaValAsp                               770775780                                                                      IleLeuTyrValAspGluAlaPheAlaCysHisSerGlyThrLeuLeu                               785790795800                                                                   AlaLeuIleAlaLeuValLysProArgSerLysValValLeuCysGly                               805810815                                                                      AspProLysGlnCysGlyPhePheAsnMetMetGlnLeuLysValAsn                               820825830                                                                      PheAsnHisAsnIleCysThrGluValCysHisLysSerIleSerArg                               835840845                                                                      ArgCysThrArgProValThrAlaIleValSerThrLeuHisTyrGly                               850855860            
                                                          GlyLysMetArgThrThrAsnProCysAsnLysProIleIleIleAsp                               865870875880                                                                   ThrThrGlyGlnThrLysProLysProGlyAspIleValLeuThrCys                               885890895                                                                      PheArgGlyTrpAlaLysGlnLeuGlnLeuAspTyrArgGlyHisGlu                               900905910                                                                      ValMetThrAlaAlaAlaSerGlnGlyLeuThrArgLysGlyValTyr                               915920925                                                                      AlaValArgGlnLysValAsnGluAsnProLeuTyrAlaProAlaSer                               930935940                                                                      GluHisValAsnValLeuLeuThrArgThrGluAspArgLeuValTrp                               945950955960                                                                   LysThrLeuAlaGlyAspProTrpIleLysValLeuSerAsnIlePro                               965970975                                                                      GlnGlyAsnPheThrAlaThrLeuGluGluTrpGlnGluGluHisAsp                               980985990                                                                      LysIleMetLysValIleGluGlyProAlaAlaProValAspAlaPhe                               99510001005                                                                    GlnAsnLysAlaAsnValCysTrpAlaLysSerLeuValProValLeu                               101010151020                                                                   AspThrAlaGlyIleArgLeuThrAlaGluGluTrpSerThrIleIle                               1025103010351040                                                               ThrAlaPheLysGluAspArgAlaTyrSerProValValAlaLeuAsn                               104510501055                                                                   
GluIleCysThrLysTyrTyrGlyValAspLeuAspSerGlyLeuPhe                               106010651070                                                                   SerAlaProLysValSerLeuTyrTyrGluAsnAsnHisTrpAspAsn                               107510801085                                                                   ArgProGlyGlyArgMetTyrGlyPheAsnAlaAlaThrAlaAlaArg                               109010951100                                                                   LeuGluAlaArgHisThrPheLeuLysGlyGlnTrpHisThrGlyLys                               1105111011151120                                                               GlnAlaValIleAlaGluArgLysIleGlnProLeuSerValLeuAsp                               112511301135                                                                   AsnValIleProIleAsnArgArgLeuProHisAlaLeuValAlaGlu                               114011451150                                                                   TyrLysThrValLysGlySerArgValGluTrpLeuValAsnLysVal                               115511601165                                                                   ArgGlyTyrHisValLeuLeuValSerGluTyrAsnLeuAlaLeuPro                               117011751180                                                                   ArgArgArgValThrTrpLeuSerProLeuAsnValThrGlyAlaAsp                               1185119011951200                                                               ArgCysTyrAspLeuSerLeuGlyLeuProAlaAspAlaGlyArgPhe                               120512101215                                                                   AspLeuValPheValAsnIleHisThrGluPheArgIleHisHisTyr                               122012251230                                                                   GlnGlnCysValAspHisAlaMetLysLeuGlnMetLeuGlyGlyAsp                               123512401245                                                                   AlaLeuArgLeuLeuLysProGlyGlyIleLeuMetArgAlaTyrGly                               125012551260             
                                                      TyrAlaAspLysIleSerGluAlaValValSerSerLeuSerArgLys                               1265127012751280                                                               PheSerSerAlaArgValLeuArgProAspCysValThrSerAsnThr                               128512901295                                                                   GluValPheLeuLeuPheSerAsnPheAspAsnGlyLysArgProSer                               130013051310                                                                   ThrLeuHisGlnMetAsnThrLysLeuSerAlaValTyrAlaGlyGlu                               131513201325                                                                   AlaMetHisThrAlaGlyCysAlaProSerTyrArgValLysArgAla                               133013351340                                                                   AspIleAlaThrCysThrGluAlaAlaValValAsnAlaAlaAsnAla                               1345135013551360                                                               ArgGlyThrValGlyAspGlyValCysArgAlaValAlaLysLysTrp                               136513701375                                                                   ProSerAlaPheLysGlyAlaAlaThrProValGlyThrIleLysThr                               138013851390                                                                   ValMetCysGlySerTyrProValIleHisAlaValAlaProAsnPhe                               139514001405                                                                   SerAlaThrThrGluAlaGluGlyAspArgGluLeuAlaAlaValTyr                               141014151420                                                                   ArgAlaValAlaAlaGluValAsnArgLeuSerLeuSerSerValAla                               1425143014351440                                                               IleProLeuLeuSerThrGlyValPheSerGlyGlyArgAspArgLeu                               144514501455                                                                   GlnGlnSerLeuAsnHisLeuPheThrAlaMetAspAlaThrAspAla  
                             146014651470                                                                   AspValThrIleTyrCysArgAspLysSerTrpGluLysLysIleGln                               147514801485                                                                   GluAlaIleAspMetArgThrAlaValGluLeuLeuAsnAspAspVal                               149014951500                                                                   GluLeuThrThrAspLeuValArgValHisProAspSerSerLeuVal                               1505151015151520                                                               GlyArgLysGlyTyrSerThrThrAspGlySerLeuTyrSerTyrPhe                               152515301535                                                                   GluGlyThrLysPheAsnGlnAlaAlaIleAspMetAlaGluIleLeu                               154015451550                                                                   ThrLeuTrpProArgLeuGlnGluAlaAsnGluGlnIleCysLeuTyr                               155515601565                                                                   AlaLeuGlyGluThrMetAspAsnIleArgSerLysCysProValAsn                               157015751580                                                                   AspSerAspSerSerThrProProArgThrValProCysLeuCysArg                               1585159015951600                                                               TyrAlaMetThrAlaGluArgIleAlaArgLeuArgSerHisGlnVal                               160516101615                                                                   LysSerMetValValCysSerSerPheProLeuProLysTyrHisVal                               162016251630                                                                   AspGlyValGlnLysValLysCysGluLysValLeuLeuPheAspPro                               163516401645                                                                   ThrValProSerValValSerProArgLysTyrAlaAlaSerThrThr                               165016551660                                                               
    AspHisSerAspArgSerLeuArgGlyPheAspLeuAspTrpThrThr                               1665167016751680                                                               AspSerSerSerThrAlaSerAspThrMetSerLeuProSerLeuGln                               168516901695                                                                   SerCysAspIleAspSerIleTyrGluProMetAlaProIleValVal                               170017051710                                                                   ThrAlaAspValHisProGluProAlaGlyIleAlaAspLeuAlaAla                               171517201725                                                                   AspValHisProGluProAlaAspHisValAspLeuGluAsnProIle                               173017351740                                                                   ProProProArgProLysArgAlaAlaTyrLeuAlaSerArgAlaAla                               1745175017551760                                                               GluArgProValProAlaProArgLysProThrProAlaProArgThr                               176517701775                                                                   AlaPheArgAsnLysLeuProLeuThrPheGlyAspPheAspGluHis                               178017851790                                                                   GluValAspAlaLeuAlaSerGlyIleThrPheGlyAspPheAspAsp                               179518001805                                                                   ValLeuArgLeuGlyArgAlaGlyAlaTyrIlePheSerSerAspThr                               181018151820                                                                   GlySerGlyHisLeuGlnGlnLysSerValArgGlnHisAsnLeuGln                               1825183018351840                                                               CysAlaGlnLeuAspAlaValGlnGluGluLysMetTyrProProLys                               184518501855                                                                   LeuAspThrGluArgGluLysLeuLeuLeuLeuLysMetGlnMetHis                               186018651870         
ProSerGluAlaAsnLysSerArgTyrGlnSerArgLysValGluAsn
1875 1880 1885
MetLysAlaThrValValAspArgLeuThrSerGlyAlaArgLeuTyr
1890 1895 1900
ThrGlyAlaAspValGlyArgIleProThrTyrAlaValArgTyrPro
1905 1910 1915 1920
ArgProValTyrSerProThrValIleGluArgPheSerSerProAsp
1925 1930 1935
ValAlaIleAlaAlaCysAsnGluTyrLeuSerArgAsnTyrProThr
1940 1945 1950
ValAlaSerTyrGlnIleThrAspGluTyrAspAlaTyrLeuAspMet
1955 1960 1965
ValAspGlySerAspSerCysLeuAspArgAlaThrPheCysProAla
1970 1975 1980
LysLeuArgCysTyrProLysHisHisAlaTyrHisGlnProThrVal
1985 1990 1995 2000
ArgSerAlaValProSerProPheGlnAsnThrLeuGlnAsnValLeu
2005 2010 2015
AlaAlaAlaThrLysArgAsnCysAsnValThrGlnMetArgGluLeu
2020 2025 2030
ProThrMetAspSerAlaValPheAsnValGluCysPheLysArgTyr
2035 2040 2045
AlaCysSerGlyGluTyrTrpGluGluTyrAlaLysGlnProIleArg
2050 2055 2060
IleThrThrGluAsnIleThrThrTyrValThrLysLeuLysGlyPro
2065 2070 2075 2080
LysAlaAlaAlaLeuPheAlaLysThrHisAsnLeuValProLeuGln
2085 2090 2095
GluValProMetAspArgPheThrValAspMetLysArgAspValLys
2100 2105 2110
ValThrProGlyThrLysHisThrGluGluArgProLysValGlnVal
2115 2120 2125
IleGlnAlaAlaGluProLeuAlaThrAlaTyrLeuCysGlyIleHis
2130 2135 2140
ArgGluLeuValArgArgLeuAsnAlaValLeuArgProAsnValHis
2145 2150 2155 2160
ThrLeuPheAspMetSerAlaGluAspPheAspAlaIleIleAlaSer
2165 2170 2175
HisPheHisProGlyAspProValLeuGluThrAspIleAlaSerPhe
2180 2185 2190
AspLysSerGlnAspAspSerLeuAlaLeuThrGlyLeuMetIleLeu
2195 2200 2205
GluAspLeuGlyValAspGlnTyrLeuLeuAspLeuIleGluAlaAla
2210 2215 2220
PheGlyGluIleSerSerCysHisLeuProThrGlyThrArgPheLys
2225 2230 2235 2240
PheGlyAlaMetMetLysSerGlyMetPheLeuThrLeuPheIleAsn
2245 2250 2255
ThrValLeuAsnIleThrIleAlaSerArgValLeuGluGlnArgLeu
2260 2265 2270
ThrAspSerAlaCysAlaAlaPheIleGlyAspAspAsnIleValHis
2275 2280 2285
GlyValIleSerAspLysLeuMetAlaGluArgCysAlaSerTrpVal
2290 2295 2300
AsnMetGluValLysIleIleAspAlaValMetGlyGluLysProPro
2305 2310 2315 2320
TyrPheCysGlyGlyPheIleValPheAspSerValThrGlnThrAla
2325 2330 2335
CysArgValSerAspProLeuLysArgLeuPheLysLeuGlyLysPro
2340 2345 2350
LeuThrAlaGluAspLysGlnAspGluAspArgArgArgAlaLeuSer
2355 2360 2365
AspGluValSerLysTrpPheArgThrGlyLeuGlyAlaGluLeuGlu
2370 2375 2380
ValAlaLeuThrSerArgTyrGluValGluGlyCysLysSerIleLeu
2385 2390 2395 2400
IleAlaMetThrThrLeuAlaArgAspIleLysAlaPheLysLysLeu
2405 2410 2415
ArgGlyProValIleHisLeuTyrGlyGlyProArgLeuValArg
2420 2425 2430
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1253 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
MetAsnTyrIleProThrGlnThrPheTyrGlyArgArgTrpArgPro
1 5 10 15
ArgProAlaAlaArgProTrpProLeuGlnAlaThrProValAlaPro
20 25 30
ValValProAspPheGlnAlaGlnGlnMetGlnGlnLeuIleSerAla
35 40 45
ValAsnAlaLeuThrMetArgGlnAsnAlaIleAlaProAlaArgPro
50 55 60
ProLysProLysLysLysLysThrThrLysProLysProLysThrGln
65 70 75 80
ProLysLysIleAsnGlyLysThrGlnGlnGlnLysLysLysAspLys
85 90 95
GlnAlaAspLysLysLysLysLysProGlyLysArgGluArgMetCys
100 105 110
MetLysIleGluAsnAspCysIlePheGluValLysHisGluGlyLys
115 120 125
ValThrGlyTyrAlaCysLeuValGlyAspLysValMetLysProAla
130 135 140
HisValLysGlyValIleAspAsnAlaAspLeuAlaLysLeuAlaPhe
145 150 155 160
LysLysSerSerLysTyrAspLeuGluCysAlaGlnIleProValHis
165 170 175
MetArgSerAspAlaSerLysTyrThrHisGluLysProGluGlyHis
180 185 190
TyrAsnTrpHisHisGlyAlaValGlnTyrSerGlyGlyArgPheThr
195 200 205
IleProThrGlyAlaGlyLysProGlyAspSerGlyArgProIlePhe
210 215 220
AspAsnLysGlyArgValValAlaIleValLeuGlyGlyAlaAsnGlu
225 230 235 240
GlySerArgThrAlaLeuSerValValThrTrpAsnLysAspMetVal
245 250 255
ThrArgValThrProGluGlySerGluGluTrpSerAlaProLeuIle
260 265 270
ThrAlaMetCysValLeuAlaAsnAlaThrPheProCysPheGlnPro
275 280 285
ProCysValProCysCysTyrGluAsnAsnAlaGluAlaThrLeuArg
290 295 300
MetLeuGluAspAsnValAspArgProGlyTyrTyrAspLeuLeuGln
305 310 315 320
AlaAlaLeuThrCysArgAsnGlyThrArgHisArgArgSerValSer
325 330 335
GlnHisPheAsnValTyrLysAlaThrArgProTyrIleAlaTyrCys
340 345 350
AlaAspCysGlyAlaGlyHisSerCysHisSerProValAlaIleGlu
355 360 365
AlaValArgSerGluAlaThrAspGlyMetLeuLysIleGlnPheSer
370 375 380
AlaGlnIleGlyIleAspLysSerAspAsnHisAspTyrThrLysIle
385 390 395 400
ArgTyrAlaAspGlyHisAlaIleGluAsnAlaValArgSerSerLeu
405 410 415
LysValAlaThrSerGlyAspCysPheValHisGlyThrMetGlyHis
420 425 430
PheIleLeuAlaLysCysProProGlyGluPheLeuGlnValSerIle
435 440 445
GlnAspThrArgAsnAlaValArgAlaCysArgIleGlnTyrHisHis
450 455 460
AspProGlnProValGlyArgGluLysPheThrIleArgProHisTyr
465 470 475 480
GlyLysGluIleProCysThrThrTyrGlnGlnThrThrAlaLysThr
485 490 495
ValGluGluIleAspMetHisMetProProAspThrProAspArgThr
500 505 510
LeuLeuSerGlnGlnSerGlyAsnValLysIleThrValGlyGlyLys
515 520 525
LysValLysTyrAsnCysThrCysGlyThrGlyAsnValGlyThrThr
530 535 540
AsnSerAspMetThrIleAsnThrCysLeuIleGluGlnCysHisVal
545 550 555 560
SerValThrAspHisLysLysTrpGlnPheAsnSerProPheValPro
565 570 575
ArgAlaAspGluProAlaArgLysGlyLysValHisIleProPhePro
580 585 590
LeuAspAsnIleThrCysArgValProMetAlaArgGluProThrVal
595 600 605
IleHisGlyLysArgGluValThrLeuHisLeuHisProAspHisPro
610 615 620
ThrLeuPheSerTyrArgThrLeuGlyGluAspProGlnTyrHisGlu
625 630 635 640
GluTrpValThrAlaAlaValGluArgThrIleProValProValAsp
645 650 655
GlyMetGluTyrHisTrpGlyAsnAsnAspProValArgLeuTrpSer
660 665 670
GlnLeuThrThrGluGlyLysProHisGlyTrpProHisGlnIleVal
675 680 685
GlnTyrTyrTyrGlyLeuTyrProAlaAlaThrValSerAlaValVal
690 695 700
GlyMetSerLeuLeuAlaLeuIleSerIlePheAlaSerCysTyrMet
705 710 715 720
LeuValAlaAlaArgSerLysCysLeuThrProTyrAlaLeuThrPro
725 730 735
GlyAlaAlaValProTrpThrLeuGlyIleLeuCysCysAlaProArg
740 745 750
AlaHisAlaAlaSerValAlaGluThrMetAlaTyrLeuTrpAspGln
755 760 765
AsnGlnAlaLeuPheTrpLeuGluPheAlaAlaProValAlaCysIle
770 775 780
LeuIleIleThrTyrCysLeuArgAsnValLeuCysCysCysLysSer
785 790 795 800
LeuSerPheLeuValLeuLeuSerLeuGlyAlaThrAlaArgAlaTyr
805 810 815
GluHisSerThrValMetProAsnValValGlyPheProTyrLysAla
820 825 830
HisIleGluArgProGlyTyrSerProLeuThrLeuGlnMetGlnVal
835 840 845
ValGluThrSerLeuGluProThrLeuAsnLeuGluTyrIleThrCys
850 855 860
GluTyrLysThrValValProSerProTyrValLysCysCysGlyAla
865 870 875 880
SerGluCysSerThrLysGluLysProAspTyrGlnCysLysValTyr
885 890 895
ThrGlyValTyrProPheMetTrpGlyGlyAlaTyrCysPheCysAsp
900 905 910
SerGluAsnThrGlnLeuSerGluAlaTyrValAspArgSerAspVal
915 920 925
CysArgHisAspHisAlaSerAlaTyrLysAlaHisThrAlaSerLeu
930 935 940
LysAlaLysValArgValMetTyrGlyAsnValAsnGlnThrValAsp
945 950 955 960
ValTyrValAsnGlyAspHisAlaValThrIleGlyGlyThrGlnPhe
965 970 975
IlePheGlyProLeuSerSerAlaTrpThrProPheAspAsnLysIle
980 985 990
ValValTyrLysAspGluValPheAsnGlnAspPheProProTyrGly
995 1000 1005
SerGlyGlnProGlyArgPheGlyAspIleGlnSerArgThrValGlu
1010 1015 1020
SerAsnAspLeuTyrAlaAsnThrAlaLeuLysLeuAlaArgProSer
1025 1030 1035 1040
ProGlyMetValHisValProTyrThrGlnThrProSerGlyPheLys
1045 1050 1055
TyrTrpLeuLysGluLysGlyThrAlaLeuAsnThrLysAlaProPhe
1060 1065 1070
GlyCysGlnIleLysThrAsnProValArgAlaMetAsnCysAlaVal
1075 1080 1085
GlyAsnIleProValSerMetAsnLeuProAspSerAlaPheThrArg
1090 1095 1100
IleValGluAlaProThrIleIleAspLeuThrCysThrValAlaThr
1105 1110 1115 1120
CysThrHisSerSerAspPheGlyGlyValLeuThrLeuThrTyrLys
1125 1130 1135
ThrAsnLysAsnGlyAspCysSerValHisSerHisSerAsnValAla
1140 1145 1150
ThrLeuGlnGluAlaThrAlaLysValLysThrAlaGlyLysValThr
1155 1160 1165
LeuHisPheSerThrAlaSerAlaSerProSerPheValValSerLeu
1170 1175 1180
CysSerAlaArgAlaThrCysSerAlaSerCysGluProProLysAsp
1185 1190 1195 1200
HisIleValProTyrAlaAlaSerHisSerAsnValValPheProAsp
1205 1210 1215
MetSerGlyThrAlaLeuSerTrpValGlnLysIleSerGlyGlyLeu
1220 1225 1230
GlyAlaPheAlaIleGlyAlaIleLeuValLeuValValValThrCys
1235 1240 1245
IleGlyLeuArgArg
1250
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 115 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: RNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1..115
(D) OTHER INFORMATION: /label=26S_region
/note="26S promoter and transcription start and
proximal downstream region of pSFV1; Figure 8."
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..24
(D) OTHER INFORMATION: /product="26S promoter region"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
ACCTCTACGGCGGTCCTAGATTGGTGCGTTAATACACAGAATCTGATTGGATCCCGGGTA 60
ATTAATTGAATTACATCCCTACGCAAACGTTTTACGGCCGCCGGTGGCGCCCGCG 115
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 127 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: RNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1..127
(D) OTHER INFORMATION: /label=26S_region
/note="26S promoter and transcription start and
proximal downstream region of pSFV2; Figure 8."
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..24
(D) OTHER INFORMATION: /product="26S promoter region"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
ACCTCTACGGCGGTCCTAGATTGGTGCGTTAATACACAGAATTCTGATTATAGCGCACTA 60
TTATATAGCACCGGATCCCGGGTAATTAATTGACGCAAACGTTTTACGGCCGCCGGTGGC 120
GCCCGCG 127
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 123 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: RNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1..123
(D) OTHER INFORMATION: /label=26S_region
/note="26S promoter and transcription start and
proximal downstream region of pSFV3; Figure 8."
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..24
(D) OTHER INFORMATION: /product="26S promoter region"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
ACCTCTACGGCGGTCCTAGATTGGTGCGTTAATACACAGAATTCTGATTATAGCGCACTA 60
TTATATAGCACCATGGATCCCGGGTAATTAATTGACGTTTTACGGCCGCCGGTGGCGCCC 120
GCG 123
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: RNA (genomic)
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Semliki Forest Virus
(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1..54
(D) OTHER INFORMATION: /label=restrict_site
/note="sequence of SFV E2 genome in vicinity of Bam HI s
vector E2; Figure 12."
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..54
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
AACTCACCTTTCGTCCCGAGAGCCGACGAACCGGCTAGAAAAGGCAAA 48
AsnSerProPheValProArgAlaAspGluProAlaArgLysGlyLys
1 5 10 15
GTCCAT 54
ValHis
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
AsnSerProPheValProArgAlaAspGluProAlaArgLysGlyLys
1 5 10 15
ValHis
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: HIV
(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1..46
(D) OTHER INFORMATION: /label=fragment
/note="HIV gp120 epitope introduced into SFV
vector E2; Figure 12."
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..45
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
GATCCGCGTATCCAGAGAGGACCAGGAAGAGCATTTGTTGAGCTA 45
AspProArgIleGlnArgGlyProGlyArgAlaPheValGluLeu
1 5 10 15
G 46
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
AspProArgIleGlnArgGlyProGlyArgAlaPheValGluLeu
1 5 10 15
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1..51
(D) OTHER INFORMATION: /label=chimaeric_seq
/note="SFV-HIV chimaeric sequence shown in Figure
12."
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..51
(D) OTHER INFORMATION: /product="SFV-HIV chimaeric
sequence"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
GAGGATCCGCGTATCCAGAGAGGACCAGGAAGAGCATTTGTTGAGGAT 48
GluAspProArgIleGlnArgGlyProGlyArgAlaPheValGluAsp
1 5 10 15
CCG 51
Pro
(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
GluAspProArgIleGlnArgGlyProGlyArgAlaPheValGluAsp
1 5 10 15
Pro
(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 60 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
                                                      (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: DNA (genomic)                                              (iii) HYPOTHETICAL: NO                                                         (iv) ANTI-SENSE: NO                                                            (ix) FEATURE:                                                                  (A) NAME/KEY: -                                                                (B) LOCATION: 1..60                                                            (D) OTHER INFORMATION: /label=oligonucleotide                                  /note="used to introduce new linker site"                                      (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:                                       CGGCCAGTGAATTCTGATTGGATCCCGGGTAATTAATTGAATTACATCCCTACGCAAACG60                 (2) INFORMATION FOR SEQ ID NO:14:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 62 base pairs                                                      (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: DNA (genomic)                                              (iii) HYPOTHETICAL: NO                                                         (iv) ANTI-SENSE: NO                                                            (ix) FEATURE:                                                                  (A) NAME/KEY: -                                                                (B) LOCATION: 1..62                                                            (D) OTHER INFORMATION: /label=oligonucleotide                                  /note="used to introduce new linker site"         
                             (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:                                       GCGCACTATTATAGCACCGGCTCCCGGGTAATTAATTGACGCAAACGTTTTACGGCCGCC60                 GG62                                                                           (2) INFORMATION FOR SEQ ID NO:15:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 62 base pairs                                                      (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: DNA (genomic)                                              (iii) HYPOTHETICAL: NO                                                         (iv) ANTI-SENSE: NO                                                            (ix) FEATURE:                                                                  (A) NAME/KEY: -                                                                (B) LOCATION: 1..62                                                            (D) OTHER INFORMATION: /label=oligonucleotide                                  /note="used to introduce new linker site"                                      (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:                                       GCGCACTATTATAGCACCATGGATCCGGGTAATTAATTGACGTTTTACGGCCGCCGGTGG60                 CG62                                                                           (2) INFORMATION FOR SEQ ID NO:16:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 21 base pairs                                                      (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                   
    (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: DNA (genomic)                                              (iii) HYPOTHETICAL: NO                                                         (iv) ANTI-SENSE: NO                                                            (ix) FEATURE:                                                                  (A) NAME/KEY: -                                                                (B) LOCATION: 1..21                                                            (D) OTHER INFORMATION: /label=primer                                           /note="SP1 upstream sequencing primer"                                         (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:                                       CGGCGGTCCTAGATTGGTGCG21                                                        (2) INFORMATION FOR SEQ ID NO:17:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 21 base pairs                                                      (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: DNA (genomic)                                              (iii) HYPOTHETICAL: NO                                                         (iv) ANTI-SENSE: YES                                                           (ix) FEATURE:                                                                  (A) NAME/KEY: -                                                                (B) LOCATION: 1..21                                                            (D) OTHER INFORMATION: /label=primer                                           /note="SP2 downstream sequencing primer"                                       (xi) SEQUENCE 
DESCRIPTION: SEQ ID NO:17:                                       CGCGGGCGCCACCGGCGGCCG21                                                        (2) INFORMATION FOR SEQ ID NO:18:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 21 base pairs                                                      (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: DNA (genomic)                                              (iii) HYPOTHETICAL: NO                                                         (iv) ANTI-SENSE: YES                                                           (ix) FEATURE:                                                                  (A) NAME/KEY: -                                                                (B) LOCATION: 1..21                                                            (D) OTHER INFORMATION: /label=primer                                           /note="primer-1 for first strand cDNA synthesis"                               (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:                                       TTTCTCGTAGTTCTCCTCGTC21                                                        (2) INFORMATION FOR SEQ ID NO:19:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 27 base pairs                                                      (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: DNA (genomic)                                              (iii) HYPOTHETICAL: NO                 
                                        (iv) ANTI-SENSE: YES                                                           (ix) FEATURE:                                                                  (A) NAME/KEY: -                                                                (B) LOCATION: 1..27                                                            (D) OTHER INFORMATION: /label=primer                                           /note="primer-2 for first strand cDNA synthesis"                               (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:                                       GTTATCCCAGTGGTTGTTCTCGTAATA27                                                  (2) INFORMATION FOR SEQ ID NO:20:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 28 base pairs                                                      (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: DNA (genomic)                                              (iii) HYPOTHETICAL: NO                                                         (iv) ANTI-SENSE: NO                                                            (ix) FEATURE:                                                                  (A) NAME/KEY: -                                                                (B) LOCATION: 1..28                                                            (D) OTHER INFORMATION: /label=primer                                           /note="5'most primer for second strand cDNA                                    synthesis, equals bp 1-28 of SFV sequence"                                     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:                                       ATGGCGGATGTGTGACATACACGACGCC28                                  
               (2) INFORMATION FOR SEQ ID NO:21:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 46 base pairs                                                      (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: DNA (genomic)                                              (iii) HYPOTHETICAL: NO                                                         (ix) FEATURE:                                                                  (A) NAME/KEY: -                                                                (B) LOCATION: 1..46                                                            (D) OTHER INFORMATION: /label=adaptor                                          /note="5'-sticky end                                                           (EcoRI-HindIII-NotI-XmaIII-SpeI) blunt end-3'                                  adaptor"                                                                       (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:                                       AATTCAAGCTTGCGGCCGCACTAGTGTTCGAACGCCGGCGTGATCA46                               (2) INFORMATION FOR SEQ ID NO:22:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 8 base pairs                                                       (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: DNA (genomic)                                              (iii) HYPOTHETICAL: NO                                                         (iv) 
ANTI-SENSE: NO                                                            (ix) FEATURE:                                                                  (A) NAME/KEY: -                                                                (B) LOCATION: 1..8                                                             (D) OTHER INFORMATION: /label=oligonucleotide                                  /note="NcoI oligonucleotide"                                                   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:                                       GCCATGGC8                                                                      (2) INFORMATION FOR SEQ ID NO:23:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 20 base pairs                                                      (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: DNA (genomic)                                              (iii) HYPOTHETICAL: NO                                                         (iv) ANTI-SENSE: NO                                                            (ix) FEATURE:                                                                  (A) NAME/KEY: -                                                                (B) LOCATION: 1..20                                                            (D) OTHER INFORMATION: /label=oligonucleotide                                  /note="oligonucleotide used for screening by                                   colony hybridization"                                                          (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:                                       GGTGACACTATAGCCATGGC20                                                         (2) INFORMATION FOR SEQ ID 
NO:24:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 24 base pairs                                                      (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: DNA (genomic)                                              (iii) HYPOTHETICAL: NO                                                         (iv) ANTI-SENSE: NO                                                            (ix) FEATURE:                                                                  (A) NAME/KEY: -                                                                (B) LOCATION: 1..24                                                            (D) OTHER INFORMATION: /label=oligonucleotide                                  /note="site-directed mutagenic oligonucleotide                                 used to introduce a BamHI site into the SFV                                    genome"                                                                        (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:                                       GATCGGCCTAGGAGCCGAGAGCCC24                                                     (2) INFORMATION FOR SEQ ID NO:25:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 80 base pairs                                                      (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: RNA (genomic)                                              (iii) HYPOTHETICAL: NO                              
                           (iv) ANTI-SENSE: NO                                                            (vi) ORIGINAL SOURCE:                                                          (A) ORGANISM: Semliki Forest Virus                                             (ix) FEATURE:                                                                  (A) NAME/KEY: -                                                                (B) LOCATION: 1..80                                                            (D) OTHER INFORMATION: /label=terminator                                       /note="3'terminal sequence of cDNA expression                                  vector complementary to alphavirus genomic RNA"                                (xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:                                       TTTCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA60                 AAAAAAAAAAAAAAACTAGT80                                                         (2) INFORMATION FOR SEQ ID NO:26:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 54 base pairs                                                      (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: RNA (genomic)                                              (iii) HYPOTHETICAL: NO                                                         (vi) ORIGINAL SOURCE:                                                          (A) ORGANISM: Semliki Forest Virus                                             (ix) FEATURE:                                                                  (A) NAME/KEY: -                                                                (B) LOCATION: 1..54                                                          
  (D) OTHER INFORMATION: /label=restrict_site                                    /note="sequence of SFV vector E2 in vicinity of Bam HI                         site; 12."                                                                     (ix) FEATURE:                                                                  (A) NAME/KEY: mutation                                                         (B) LOCATION: 27..32                                                           (D) OTHER INFORMATION: /label=restriction_sit                                  /note="BamHI recognition sequence introduced into                              SFV E2 genome in SFV vector E2."                                               (ix) FEATURE:                                                                  (A) NAME/KEY: CDS                                                              (B) LOCATION: 1..54                                                            (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:                                       AACTCACCTTTCGTCCCGAGAGCCGAGGATCCGGCTAGAAAAGGCAAA48                             AsnSerProPheValProArgAlaGluAspProAlaArgLysGlyLys                               151015                                                                         GTCCAT54                                                                       ValHis                                                                         (2) INFORMATION FOR SEQ ID NO:27:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 18 amino acids                                                     (B) TYPE: amino acid                                                           (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: protein                                                    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:                                       
AsnSerProPheValProArgAlaGluAspProAlaArgLysGlyLys                               151015                                                                         ValHis                                                                         __________________________________________________________________________ 

We claim:
 1. A recombinant RNA molecule which can be efficiently translated and replicated in an animal host cell, comprising a Semliki Forest Virus RNA genome and an exogenous RNA sequence, wherein said Semliki Forest Virus RNA genome contains at least one deletion or stop codon mutation such that at least one structural protein of the Semliki Forest Virus cannot be made upon introduction of said recombinant RNA into said host cell, and further wherein said exogenous RNA sequence is operatively inserted into a region of the Semliki Forest Virus RNA genome which is non-essential to replication of the recombinant RNA molecule such that the exogenous RNA is expressed from a Semliki Forest Virus transcriptional promoter when the recombinant RNA is introduced into a host cell and further such that the exogenous RNA expresses its biological function in said host cell.
 2. The recombinant RNA of claim 1, wherein the exogenous RNA sequence encodes a protein, a polypeptide or a peptide sequence defining an exogenous antigenic epitope or determinant.
 3. The recombinant RNA of claim 1, wherein the Semliki Forest Virus genome RNA comprises a 5'-terminal portion, at least one region coding for non-structural proteins required for replication of the Semliki Forest Virus RNA genome, the subgenome promoter region and a 3'-terminal portion of said Semliki Forest Virus RNA genome.
 4. The recombinant RNA of claim 1, wherein the exogenous RNA sequence encodes a peptide or protein and is inserted into the subgenomic 26S RNA of Semliki Forest Virus.
 5. A composition comprising the recombinant RNA of claim 1 contained in a particle comprising an alphavirus nucleocapsid and a surrounding membrane, wherein said membrane includes an alphavirus spike protein.
 6. A recombinant RNA according to claim 1 having a length effective for packaging into an infectious viral particle comprising wild-type alphavirus structural proteins.
 7. A DNA vector comprising a cDNA having one strand complementary to the recombinant RNA of claim 1.
 8. The recombinant RNA of claim 1, wherein said exogenous RNA sequence encodes a protein and said biological function is expression of biologically active protein.
 9. The recombinant RNA of claim 4, wherein said exogenous RNA sequence is inserted into a portion of the 26S subgenomic RNA selected from the group consisting of a portion of the capsid protein RNA, the p62 RNA, the 6K RNA and the E1 RNA.
 10. The recombinant RNA of claim 4, wherein the exogenous RNA sequence encodes a foreign viral epitopic peptide and is inserted into the portion of the Semliki Forest Virus genome encoding the E2 spike protein precursor subunit.
 11. A recombinant RNA according to claim 6, wherein said alphavirus structural proteins include all of the nucleocapsid, p62, 6k and E1 proteins of Semliki Forest Virus.
 12. A DNA vector of claim 7, further comprising a promoter for transcription of RNA operatively linked to said cDNA such that transcription of said cDNA produces a recombinant RNA molecule which can be efficiently translated and replicated in an animal host cell, said recombinant RNA comprising a Semliki Forest Virus RNA genome and an exogenous RNA sequence, wherein said exogenous RNA sequence is operatively inserted into a region of the alphavirus RNA genome which is non-essential to replication of the recombinant RNA molecule such that the exogenous RNA is expressed from the Semliki Forest Virus transcriptional promoter when said DNA vector is introduced into a host cell and further such that the exogenous RNA expresses its biological function in said host cell.
 13. A cell containing a DNA vector according to claim 7.
 14. A DNA vector of claim 12, wherein said promoter is an SP6 promoter and said cDNA is located immediately downstream of the SP6 promoter and further wherein said cDNA has a 5'-terminal sequence of ATGG or GATGG and a 3'-terminal sequence of TTTCCA₆₉ ACTAGT.
 15. A DNA vector according to claim 12, wherein a portion of said cDNA encoding an alphavirus structural protein is deleted, and further comprising a polylinker, wherein said polylinker is composed of DNA having a nucleotide sequence containing a plurality of restriction enzyme recognition sites.
 16. A DNA vector according to claim 12, wherein said cDNA contains a mutation in the region encoding the protease cleavage site in the p62 protein of the Semliki Forest Virus, wherein said mutation results in expression of a p62 protein of Semliki Forest Virus that is not cleavable by intracellular proteases endogenous to said host cell.
 17. An RNA molecule made by transcription of a DNA vector of claim 12.
 18. A cell containing a DNA vector according to claim 12.
 19. A DNA vector according to claim 15, wherein said polylinker is operatively linked to said cDNA so as to allow expression of DNA encoding an exogenous protein in a host cell transformed with said DNA vector.
 20. A DNA vector according to claim 15 wherein said polylinker is inserted into the region of the cDNA encoding the p62 spike protein.
 21. A DNA vector according to claim 19, wherein said restriction enzyme recognition sites are sites for the enzymes BamHI, SmaI and XmaI.
 22. A DNA vector according to claim 19, wherein said polylinker is operatively linked to said cDNA so as to allow expression of DNA encoding an exogenous protein as a part of an alphavirus structural protein.
 23. A DNA vector of claim 16, wherein the cell-entry activity of said p62 protein can be activated by treatment with a protease in vitro.
 24. A cell containing a DNA vector according to claim 16.
 25. A DNA vector of claim 23, wherein said protease is trypsin or chymotrypsin.
 26. A DNA vector of claim 23, which is selected from the group consisting of pSFV1, pSFV2 and pSFV3.
 27. A helper vector comprising a cDNA encoding an alphavirus RNA which expresses at least one alphavirus structural protein and wherein said alphavirus RNA lacks sequences encoding RNA signals for packaging of RNA into alphavirus particles, but contains the 5' and 3' nucleotides needed for replication of the alphavirus RNA in a host cell and also contains nucleotides encoding a promoter for transcription of said DNA encoding said alphavirus structural protein in said host cell.
 28. A helper vector of claim 27, wherein the nucleotides needed for replication are the replication sequences from Semliki Forest Virus and the structural protein sequences and promoter sequences are encoded by and direct transcription of, respectively, the Semliki Forest Virus 26S mRNA.
 29. A helper vector of claim 27, wherein said cDNA comprises the nucleotides 1 to 308, inclusive, and 6400 to 11517, inclusive, of Sequence I.D. No. 1.
 30. A helper vector of claim 27, wherein said structural protein is functionally homologous to a protein selected from the group consisting of the nucleocapsid, p62, 6k and E1 proteins of Semliki Forest Virus.
 31. A helper vector of claim 27, wherein said cDNA contains a mutation in the protease cleavage site in the alphavirus structural protein homologous in function to the p62 protein of the Semliki Forest Virus, wherein said mutation results in expression of a p62-homologous protein that is not cleavable by intracellular proteases endogenous to said host cell.
 32. A helper vector of claim 27, wherein said structural protein is the p62 protein of Semliki Forest Virus.
 33. A cell containing a helper vector according to claim 27.
 34. The helper vector of claim 28, wherein said cDNA comprises the nucleotides 1 to 308, inclusive, and 6400 to 11517, inclusive, of Sequence I.D. No. 1.
 35. A helper vector of claim 28, wherein said structural protein is a protein selected from the group consisting of the nucleocapsid, p62, 6k and E1 proteins.
 36. A helper vector of claim 28, wherein said cDNA contains a mutation in the protease cleavage site in the p62 protein, wherein said mutation results in expression of a p62 protein that is not cleavable by intracellular proteases endogenous to said host cell.
 37. A cell containing a helper vector according to claim 31.
 38. A helper vector of claim 32, wherein said cDNA contains a mutation in the protease cleavage site in the p62 protein, wherein said mutation results in expression of a p62 protein that is not cleavable by intracellular proteases endogenous to said host cell.
 39. A method for producing recombinant alphavirus particles containing a recombinant alphavirus genome, comprising transfecting a host cell with: a first vector comprising a cDNA encoding an alphavirus RNA which expresses at least one alphavirus structural protein and wherein said alphavirus RNA lacks sequences encoding RNA signals for packaging of RNA into alphavirus nucleocapsid particles, but contains the 5' and 3' nucleotides needed for replication of the alphavirus RNA in a host cell and also contains nucleotides encoding a promoter for transcription of said DNA encoding said alphavirus structural protein in said host cell; and a second DNA vector comprising a cDNA encoding a recombinant alphavirus RNA genome, wherein said recombinant alphavirus RNA genome contains at least one deletion or stop codon mutation in the region encoding said structural protein encoded by said first vector, such that said structural protein that is encoded by said first vector cannot be made upon introduction of said second DNA vector into said host cell, and encoding all other structural proteins needed for assembly of an alphavirus particle, so that said other structural proteins are expressed in said host cell, and further wherein an exogenous RNA sequence, encoding said exogenous protein, is operatively inserted into a region of the recombinant alphavirus RNA genome which is non-essential to replication of the recombinant alphavirus RNA genome such that the exogenous RNA is capable of expressing said exogenous protein in said host cell; allowing assembly of said recombinant alphavirus particles from structural proteins expressed by said first and second vectors; and recovering said recombinant alphavirus particles from cultures of said host cell.
 40. A cell according to claim 18, which is a stably transformed animal cell.
 41. A cell according to claim 24, which is a stably transformed animal cell.
 42. A cell according to claim 40, wherein said animal cell is a BHK cell.
 43. A cell according to claim 41, wherein said animal cell is a BHK cell.
 44. A chimeric alphavirus comprising an alphavirus structural protein containing an amino acid sequence which is exogenous to said alphavirus and wherein said structural protein containing said exogenous amino acid sequence is packaged into a chimeric viral particle.
 45. A chimeric alphavirus according to claim 44, wherein said exogenous amino acid sequence is contained within a structural protein that is functionally homologous to the p62 protein of Semliki Forest Virus.
 46. The chimeric virus of claim 44, wherein said alphavirus is Semliki Forest Virus.
 47. A chimeric alphavirus according to claim 45 wherein said structural protein is the p62 protein of Semliki Forest Virus.
 48. A chimeric alphavirus according to claim 47, wherein said alphavirus is Semliki Forest Virus. 